Preface


Every modern operating system has at least one shell, and some have many. 
Some shells are command line-oriented, such as the shell discussed in this
book. Others are graphical, like Windows Explorer or the Macintosh Finder. 
Some users will interact with the shell only long enough to launch their 
favorite application, and then never emerge from that until they log off. But 
most users spend a significant amount of time using the shell. The more you 
know about your shell, the faster and more productive you can be. 


Whether you are a system administrator, a programmer, or an end user, there 
are certainly occasions where a simple (or perhaps not so simple) shell script 
can save you time and effort, or facilitate consistency and repeatability for 
some important task. Even using an alias to change or shorten the name of a 
command you use often can have a significant effect. We’ll cover this and 
much more. 


As with any general programming language, there is more than one way to do 
a given task in the shell. In some cases, there is only one best way, but in 
most cases there are at least two or three equally effective and efficient ways 
to write a solution. Which way you choose depends on your personal style, 
creativity, and familiarity with different commands and techniques. This is as 
true for us as authors as it is for you as the reader. In most cases we will 
choose a single method and implement it. In a few cases we may choose a 
particular method and explain why we think it’s the best. We may also 
occasionally show more than one equivalent solution so you can choose the 
one that best fits your needs and environment. 


There is also sometimes a choice between a clever way to write some code, 
and a readable way. We will choose the readable way every time because 
experience has taught us that no matter how transparent you think your clever 
code is now, 6 or 18 months and 10 projects from now, you will be scratching 
your head asking yourself what you were thinking. Trust us: write clear code, 
and document it—you’ll thank yourself (and us) later. 


Who Should Read This Book 


This book is for anyone who uses a Unix or Linux system, as well as system 
administrators who may use several systems on any given day. With it, you 
will be able to create scripts that allow you to accomplish more, in less time, 
more easily, consistently, and repeatably than ever before. 


Anyone? Yes. New users will appreciate the sections on automating 
repetitive tasks, making simple substitutions, and customizing their 
environment to be more friendly and perhaps behave in more familiar ways. 
Power users and administrators will find new and different solutions to 
common tasks and challenges. Advanced users will have a collection of 
techniques they can use at a moment’s notice to put out the latest fire, without 
having to remember every little detail of syntax. 


Ideal readers include: 


• New Unix or Linux users who don’t know much about the shell, but want
  to do more than point and click

• Experienced Unix or Linux users and system administrators looking for
  quick answers to shell scripting questions

• Programmers who work in a Unix or Linux (or even Windows)
  environment and want to be more productive

• New Unix or Linux sysadmins, or those coming from a Windows
  environment who need to come up to speed quickly

• Experienced Windows users and sysadmins who want a more powerful
  scripting environment


This book will only briefly cover basic and intermediate shell scripting—see 
Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly) and 
Classic Shell Scripting by Nelson H. F. Beebe and Arnold Robbins 
(O’Reilly) for more in-depth coverage. Instead, our goal is to provide 
solutions to common problems, with a strong focus on the “how to” rather 
than the theory. We hope this book will save you time when figuring out 
solutions or trying to remember syntax. In fact, that’s why we wrote this 
book: it’s what we wanted, one we could read through to get ideas, then refer
to for practical working examples when needed. That way we wouldn’t have
to remember the subtle differences between the shell, Perl, C, and so forth. 


This book assumes you have access to a Unix or Linux system (or see 
Recipes 1.14 through 1.18, or Recipe 15.4) and are familiar with logging in, 
typing basic commands, and using a text editor. You do not have to be root to 
use the vast majority of the recipes, though there are a few, particularly 
dealing with installing bash, where root access will be needed. 


About This Book 


This book covers bash, the GNU Bourne Again Shell, which is a member of 
the family of shells that includes the original Bourne shell, sh, the Korn shell, 
ksh, and the public domain Korn shell, pdksh. While these and other shells 
such as dash and zsh are not specifically covered, odds are that most of the 
scripts will work pretty well with them. 


You should be able to read this book cover to cover, and also just pick it up 
and read anything that catches your eye. But perhaps most importantly, we 
hope that when you have a question about how to do something or you need a 
hint, you will be able to easily find the right answer—or something close 
enough—and save time and effort. 


A great part of the Unix philosophy is to build simple tools that do one thing 
well, then combine them as needed. This combination of tools is often 
accomplished via a shell script because these commands, called pipelines, can 
be long or difficult to remember and type. Where appropriate, we’ll cover the 
use of many of these tools in the context of the shell script as the glue that 
holds the pieces together to achieve the goal. 


The first edition of this book was written using OpenOffice.org Writer 
running on whatever Linux or Windows machine happened to be handy, and 
kept in Subversion (see Appendix D). The nature of the Open Document 
Format facilitated many critical aspects of writing this book, including cross- 
references and extracting code (see Recipe 13.18). That source was later 
converted to DocBook for production. 


For the second edition, we’ve switched to Asciidoc and Git on O’Reilly’s 
Atlas system, which worked very well. We’re grateful to O’Reilly’s 
production and tools departments for their help. 


GNU Software 


bash and many of the other tools we discuss in this book are part of the GNU 
Project. GNU (pronounced guh-noo, like canoe) is a recursive acronym for 
“GNU’s Not Unix,” and the project dates back to 1984. Its goal is to develop 
a free (as in freedom) Unix-like operating system. 


Without getting into too much detail, what is commonly referred to as Linux 
is, in fact, a kernel with various supporting software as a core. The GNU 
tools are wrapped around it and it has a vast array of other software that may 
be included, depending on your distribution. However, the Linux kernel itself 
is not GNU software. 


The GNU Project argues that Linux should in fact be called “GNU/Linux,” 
and it has a good point, so some distributions (notably Debian) do this. 
Therefore, GNU’s goal has arguably been achieved, though the result is not 
exclusively GNU. 


The GNU Project has contributed a vast amount of superior software, notably 
including bash. There are GNU versions of practically every tool we discuss 
in this book, and while the GNU tools are more rich in terms of features and 
(usually) friendliness, they are also sometimes a little different. We discuss 
this in Recipe 15.3, though the commercial Unix vendors in the 1980s and 
1990s are also largely to blame for these differences. 


Enough (several books this size worth) has already been said about all of 
these aspects of GNU, Unix, and Linux, but we felt that this brief note was 
appropriate. See http://www.gnu.org for much more on the topic. 


A Note About Code Examples 


When we show an executable piece of shell scripting in this book, we 
typically show it in an offset area like this: 


$ ls
a.out cong.txt def.conf file.txt more.txt zebra.list
$


The first character is often a dollar sign ($) to indicate that this command has 
been typed at the bash shell prompt. (Remember that you can change the 
prompt, as described in Recipe 16.2, so your prompt may look very 
different.) The prompt is printed by the shell; you type the remainder of the 
line. Similarly, the last line in such an example is often a prompt (the $ 
again), to show that the command has ended execution and control has 
returned to the shell. 


The pound or hash sign (#) is a little trickier. In many Unix or Linux files, 
including bash shell scripts, a leading # denotes a comment, and we have 
used it that way in some of our code examples. But as the trailing symbol in a 
bash command prompt (instead of $), # means you are logged in as root. We 
only have one example that is running anything as root, so that shouldn’t be 
confusing, but it’s important to understand. 


When you see an example without the prompt string, we are showing the 
contents of a shell script. For several large examples we will number the lines 
of the script, though the numbers are not part of the script. 


We may also occasionally show an example as a session log or a series of 
commands. In some cases, we may cat one or more files so you can see the 
script and/or datafiles we’ll be using in the example or in the results of our 
operation, like this: 


$ cat data_file
static header line1
static header line2
1 foo
2 bar
3 baz


Many of the longer scripts and functions are available to download as well. 
See “Using Code Examples” for details. We have chosen to use
#!/usr/bin/env bash for these examples, where applicable, as that is more
portable than the #!/bin/bash you will see on Linux or a Mac. See Recipe
15.1 for more details.
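
As a rough illustration of that first line in context, here is a minimal sketch of a script that starts with the env form. The filename and the message are hypothetical and are not taken from the downloadable examples:

```shell
#!/usr/bin/env bash
# hello.sh: hypothetical example script, not one of the book's downloads.
# env searches $PATH for bash, so the script does not depend on bash
# living in /bin on every system.
greeting="Hello from bash ${BASH_VERSION:-unknown}"
echo "$greeting"
```

The version number printed will depend on whichever bash env finds first in your $PATH.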


Also, you may notice something like the following in many code examples:


# cookbook filename: snippet_name


That means that the code you are reading is available for download in our 
GitHub repository. You’ll find the code in something like 
/chXX/snippet_name, where chXX is the chapter and snippet_name is the 
name of the file. 


Useless Use of cat 


Certain Unix users take a positively giddy delight in pointing out 
inefficiencies in other people’s code. Most of the time this is constructive 
criticism gently given and gratefully received. 


Probably the most common case is the so-called “useless use of cat award” 
bestowed when someone does something like cat file | grep foo instead 
of simply grep foo file. In this case, cat is unnecessary and incurs some 
system overhead since it runs in a subprocess. Another common case would 
be cat file | tr '[A-Z]' '[a-z]' instead of
tr '[A-Z]' '[a-z]' < file. Sometimes using cat can even cause your script to
fail (see Recipe 19.8).
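
To make the comparison concrete, here is a small sketch that runs both forms against a throwaway file. The sample file and its contents are assumptions made up for this illustration:

```shell
#!/usr/bin/env bash
# Assumption: a scratch file with made-up contents, created only for this demo.
sample=$(mktemp)
printf '%s\n' 'foo one' 'bar two' 'FOO three' > "$sample"

# With the extra cat process feeding grep through a pipe:
with_cat=$(cat "$sample" | grep foo)

# Without it; grep opens the file itself:
without_cat=$(grep foo "$sample")

# tr reads only standard input, so a redirect takes the place of cat:
lower=$(tr '[A-Z]' '[a-z]' < "$sample")

echo "$with_cat"
echo "$without_cat"
echo "$lower"
rm -f "$sample"
```

Both grep forms produce the same output; the only difference is the one extra process.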


But...(you knew that was coming, didn’t you?) sometimes unnecessarily 
using cat actually does serve a purpose. It might be a placeholder to 
demonstrate a fragment of a pipeline, with other commands later replacing it 
(perhaps even cat -n). Or it might be that placing the file near the left side 
of the code draws the eye to it more clearly than if it were hidden behind a < 
on the far-right side of the page. 


While we applaud efficiency and agree it is a goal to strive for, it isn’t as 
critical as it once was. We are not advocating carelessness and code bloat, 
we're just saying that processors aren’t getting any slower anytime soon. So 
if you like cat, use it.


A Note About Perl 


We made a conscious decision to avoid using Perl in our solutions as much as 
possible, though there are still a few cases where it makes sense. Perl is 
already covered elsewhere in far greater depth and breadth than we could 
ever manage here. And Perl solutions are generally much larger, with 
significantly more overhead, than ours. There is also a fine line between shell 
scripting and Perl scripting, and this is a book about shell scripting. 


Shell scripting is basically glue for sticking Unix programs together, whereas 
Perl incorporates much of the functionality of the external Unix programs 
into the language itself. This makes it more efficient and in some ways more 
portable, at the expense of being different and making it harder to efficiently 
run any external programs you still need. 


The choice of which tool to use often has more to do with familiarity than 
with any other reason. The bottom line is always getting the work done; the 
choice of tools is secondary. We’ll show you many ways to do things using 
bash and related tools. When you need to get your work done, you get to 
choose what tools you use. 


More Resources 


• Perl Cookbook, 2nd Edition, by Nathan Torkington and Tom Christiansen
  (O’Reilly)

• Programming Perl, 4th Edition, by Larry Wall et al. (O’Reilly)

• Perl Best Practices, by Damian Conway (O’Reilly)

• Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl
  (O’Reilly)

• Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly)

• Classic Shell Scripting, by Nelson H. F. Beebe and Arnold Robbins
  (O’Reilly)


Conventions Used in This Book

The following typographical conventions are used in this book: 

Plain text 
Indicates menu titles, menu options, menu buttons, and keyboard 
accelerators (such as Alt and Ctrl). 

Italic
Indicates new terms, URLs, email addresses, filenames, file extensions,
pathnames, directories, and Unix utilities.

Constant width
Indicates commands, options, switches, variables, attributes, keys,
functions, types, classes, namespaces, methods, modules, properties,
parameters, values, objects, events, event handlers, XML tags, HTML
tags, macros, the contents of files, and the output from commands.

Constant width bold
Shows commands or other text that should be typed literally by the user.

Constant width italic
Shows text that should be replaced with user-supplied values.


NOTE 


This icon signifies a general note. 


TIP 


This icon signifies a tip or suggestion. 


WARNING


This icon indicates a warning or caution.


Using Code Examples 


Supplemental material (code examples, exercises, etc.) is available for 
download at https://github.com/vossenjp/bashcookbook-examples. 


This book is here to help you get your job done. In general, if example code 
is offered with this book, you may use it in your programs and 
documentation. You do not need to contact us for permission unless you’re 
reproducing a significant portion of the code. For example, writing a program 
that uses several chunks of code from this book does not require permission. 
Selling or distributing a CD-ROM of examples from O’Reilly books does 
require permission. Answering a question by citing this book and quoting 
example code does not require permission. Incorporating a significant 
amount of example code from this book into your product’s documentation 
does require permission. 


We appreciate, but do not require, attribution. An attribution usually includes 
the title, author, publisher, and ISBN. For example: “bash Cookbook, 2nd 
Edition, by Carl Albing and JP Vossen. Copyright 2018 Carl Albing and JP 
Vossen, 978-1-491-97533-6.” 


If you feel your use of code examples falls outside fair use or the permission 
given above, feel free to contact us at permissions@oreilly.com. 


O’Reilly Safari
NOTE 


Safari (formerly Safari Books Online) is a membership-based training and 
reference platform for enterprise, government, educators, and individuals. 


Members have access to thousands of books, training videos, Learning Paths, 
interactive tutorials, and curated playlists from over 250 publishers, including 
O’Reilly Media, Harvard Business Review, Prentice Hall Professional, 
Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press,
Adobe, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan 
Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, 
New Riders, McGraw-Hill, Jones & Bartlett, and Course Technology, among 
others. 


For more information, please visit http://oreilly.com/safari. 


We’d Like to Hear from You


Please address comments and questions concerning this book to the 
publisher: 

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)


We have a web page for this book, where we list errata, examples, and any
additional information. You can access this page at
http://bit.ly/bash_cookbook_2E.


You can find information about this book, code samples, errata, links, bash 
documentation, and more at the authors’ site, http://www.bashcookbook.com. 


Please drop by for a visit to learn, contribute, or chat. The authors would love 
to hear from you about what you like and don’t like about the book, what 
bash wonders you may have found, or lessons you have learned. To comment 
or ask technical questions about this book, send email to 
bookquestions@oreilly.com. 


For more information about our books, courses, conferences, and news, see 
our website at http://www.oreilly.com. 


Find us on Facebook: http://facebook.com/oreilly 
Follow us on Twitter: http://twitter.com/oreillymedia 
Watch us on YouTube: http://www.youtube.com/oreillymedia


Chapter 1. Beginning bash


What’s a shell, and why should you care about it? 


Any recent computer operating system (by recent, we mean since about 
1970) has some sort of user interface—some way of specifying commands 
for the operating system to execute. But in lots of operating systems, that 
command interface was really built in and there was only one way to talk to 
the computer. Furthermore, an operating system’s command interface would 
let you execute commands, but that was about all. After all, what else was 
there to do? 


The Unix operating system popularized the notion of separating the shell (the 
part of the system that lets you type commands) from everything else: the 
input/output system, the scheduler, memory management, and all of the other 
things the operating system takes care of for you (and that most users don’t 
want to care about). The shell was just one more program; it was a program 
whose job was executing other programs on behalf of users. 


But that was the beginning of a revolution. The shell was just another 
program that ran on Unix; if you didn’t like the standard one, you could 
create your own. So by the end of Unix’s first decade, there were at least two 
competing shells: the Bourne shell, sh (which was a descendant of the 
original Thompson shell), plus the C shell, csh. By the end of Unix’s second 
decade, there were a few more alternatives: the Korn shell, ksh, and the first 
versions of the bash shell. By the end of Unix’s third decade, there were 
probably a dozen different shells. 


You probably don’t sit around saying, “Should I use csh or bash or ksh 
today?” You’re probably happy with the standard shell that came with your 
Linux (or BSD or macOS or Solaris or HP/UX) system. But disentangling the 
shell from the operating system itself made it much easier for software 
developers (such as Brian Fox, the creator of bash, and Chet Ramey, the 
current developer and maintainer) to write better shells—you could create a 
new shell without modifying the operating system itself. It was much easier
to get a new shell accepted, since you didn’t have to talk some operating 
system vendor into building the shell into their system; all you had to do was 
package the shell so that it could be installed just like any other program. 


Still, you might be thinking that sounds like a lot of fuss for something that 
just takes commands and executes them. And you would be right—a shell 
that just let you type commands wouldn't be very interesting. However, two 
factors drove the evolution of the Unix shell: user convenience and 
programming. And the result is a modern shell that does much more than just 
accept commands. 


Modern shells are very convenient. For example, they remember commands 
that you’ve typed, and let you reuse those commands. Modern shells also let 
you edit those commands, so they don’t have to be the same each time. And 
modern shells let you define your own command abbreviations, shortcuts, 
and other features. For an experienced user, typing commands (e.g., with 
shorthand, shortcuts, and command completion) is a lot more efficient and 
effective than dragging things around in a fancy windowed interface. 


But beyond simple convenience, shells are programmable. There are many 
sequences of commands that you type again and again. Whenever you do 
anything a second time, you should ask, “Can’t I write a program to do this 
for me?” You can. A shell is also a programming language that’s specially 
designed to work with your computer system’s commands. So, if you want to 
generate a thousand MP3 files from WAV files, you can write a shell 
program (or shell script). If you want to compress all of your system’s 
logfiles, you can write a shell script to do it. Whenever you find yourself 
doing a task repeatedly, you should try to automate it by writing a shell 
script. There are more powerful scripting languages, like Perl, Python, and 
Ruby, but the Unix shell (whatever flavor of shell you’re using) is a great 
place to start. After all, you already know how to type commands; why make 
things more complex? 
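
As a taste of what such a script might look like, here is a minimal sketch of the logfile case. It works on a scratch directory with made-up log names rather than on real system logs, so the paths and filenames are assumptions for illustration only:

```shell
#!/usr/bin/env bash
# Assumption: a scratch directory with invented log names stands in for
# your system's real logfiles.
logdir=$(mktemp -d)
for name in app.log web.log db.log; do
    echo "sample entry" > "$logdir/$name"
done

# Compress every .log file in the directory, counting as we go.
count=0
for logfile in "$logdir"/*.log; do
    gzip "$logfile"            # replaces file.log with file.log.gz
    count=$((count + 1))
done

echo "compressed $count files in $logdir"
# When you are finished looking around: rm -rf "$logdir"
```

The loop is the part worth noticing: the same three lines handle three files or three thousand.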


1.1 Why bash?


Why is this book about bash, and not some other shell? Because bash is
everywhere. It may not be the newest, and it’s arguably not the fanciest or the 
most powerful (though if not, it comes close), nor is it the only shell that’s 
distributed as open source software—but it is ubiquitous. 


The reason has to do with history. The first shells were fairly good 
programming tools, but not very convenient for users. The C shell added a lot 
of user conveniences (like the ability to repeat a command you’d just typed), 
but as a programming language it was quirky. The Korn shell, which came 
along next (in the early ’80s), added a lot of user conveniences, improved the 
programming language, and looked like it was on the path to widespread 
adoption. But ksh wasn’t open source software at first; it was a proprietary
software product, and was therefore difficult to ship with a free operating 
system like Linux. (The Korn shell’s license was changed in 2000, and again 
in 2005.) 


In the late 1980s, the Unix community decided standardization was a good 
thing, and the POSIX working groups (organized by the IEEE) were formed. 
POSIX standardized the Unix libraries and utilities, including the shell. The 
standard shell was primarily based on the 1988 version of the Korn shell, 
with some C shell features and a bit of invention to fill in the gaps. bash was 
begun as part of the GNU Project’s effort to produce a complete POSIX 
system, which naturally needed a POSIX shell. 


bash provided the programming features that shell programmers needed, plus 
the conveniences that command-line users liked. It was originally conceived 
as an alternative to the Korn shell, but as the free software movement became 
more important, and as Linux became more popular, bash quickly 
overshadowed ksh. 


As a result, bash is the default user shell on every Linux distribution we 
know about (there are a few hundred Linux distros, so there are probably a 
few with some oddball default shell), as well as macOS (and the earlier OS X 
versions). It’s also available for just about every other Unix operating system, 
including BSD Unix and Solaris. In the rare cases where bash doesn’t ship 
with the operating system, it’s easy to install. It’s even available for 
Windows, via Cygwin and also the new Linux Subsystem (Ubuntu). bash is 
both a powerful programming language and a good user interface, and you
won’t find yourself sacrificing keyboard shortcuts to get elaborate 
programming features. 


You can’t possibly go wrong by learning bash. The most common default 
shells are the old Bourne shell and bash, which is mostly Bourne shell- 
compatible. One of these shells is certainly present on any modern, major 
Unix or Unix-like operating system. And as noted, if bash isn’t present you 
can always install it. But there are other shells. In the spirit of free software, 
the authors and maintainers of all of these shells share ideas. If you read the 
bash change logs, you’ll see many places where a feature was introduced or
tweaked to match behavior on another shell. But most people won’t care.
They’ll use whatever is already there and be happy with it. So if you are
interested, by all means investigate other shells. There are many good 
alternatives, and you may find one you like better—though it probably won’t 
be as ubiquitous as bash. 


1.2 The bash Shell


bash is a shell: a command interpreter. The main purpose of bash (or of any 
shell) is to allow you to interact with the computer’s operating system so that 
you can accomplish whatever you need to do. Usually that involves 
launching programs, so the shell takes the commands you type, determines 
from that input what programs need to be run, and launches them for you. 
You will also encounter tasks that involve a sequence of actions to perform 
that are recurring, or very complicated, or both. Shell programming, usually 
referred to as shell scripting, allows you to automate these tasks for ease of 
use, reliability, and reproducibility. 


In case you’re new to bash, we’ll start with some basics. If you’ve used Unix 
or Linux at all, you probably aren’t new to bash—but you may not have 
known you were using it. bash is really just a language for executing 
commands, so the commands you’ve been typing all along (e.g., ls, cd, grep,
cat) are, in a sense, bash commands. Some of these commands are built into 
bash itself; others are separate programs. For now, it doesn’t make a 
difference which are which.


We’ll end this chapter with a few recipes for getting bash. Most systems 
come with bash preinstalled, but a few don’t. Even if your system comes with 
bash, it’s always a good idea to know how to get and install it, because new
versions, with new features, are released from time to time.


If you’re already running bash, and are somewhat familiar with it, you may 
want to go straight to Chapter 2. You are not likely to read this book in order, 
and if you dip into the middle, you should find some recipes that demonstrate 
what bash is really capable of. But first, the basics. 


1.3 Decoding the Prompt 


Problem 


You'd like to know what all the punctuation on your screen means. 


Solution 


All command-line shells have some kind of prompt to alert you that the shell 
is ready to accept your input. What the prompt looks like depends on many 
factors including your operating system type and version, shell type and 
version, distribution, and how someone else may have configured it. In the 
Bourne family of shells, a trailing $ in the prompt generally means you are 
logged in as a regular user, while a trailing # means you are root. The root 
account is the administrator of the system, equivalent to the System account 
on Windows (which is even more powerful than the Administrator account). 
root is all-powerful and can do anything on a typical Unix or Linux system. 
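
If you want to confirm what the prompt character is telling you, a short sketch like the following can help. The prompt_char function is a hypothetical helper written for this illustration; it relies only on the fact that root always has user ID 0:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the prompt character that the Bourne-family
# convention associates with a given numeric user ID.
prompt_char() {
    if [ "$1" -eq 0 ]; then
        echo '#'    # UID 0 is root
    else
        echo '$'    # anything else is a regular user
    fi
}

echo "User $(id -un) would conventionally see: $(prompt_char "$(id -u)")"
```

Keep in mind this only reflects the convention; a customized prompt can display anything its owner likes.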


Default prompts also often display the path to the directory that you are 
currently in; however, they usually abbreviate it, so a ~ means you are in your 
home directory. Some default prompts may also display your username and 
the name of the machine you are logged into. If that seems silly now, it won’t 
when you’re logged into five machines at once, possibly under different 
usernames.


Here is a typical Linux prompt for a user named jp on a machine called
adams, sitting in the home directory. The trailing $ indicates this is a regular 
user, not root: 


jp@adams:~$ 


Here’s the prompt after changing to the /tmp directory. Notice how ~, which 
really meant /home/jp, has changed to /tmp: 


jp@adams:/tmp$


Discussion 


The shell’s prompt is the thing you will see most often when you work at the 
command line, and there are many ways to customize it more to your liking. 
But for now, it’s enough to know how to interpret it. Of course, your default 
prompt may be different, but you should be able to figure out enough to get 
by for now. 


There are some Unix or Linux systems where the power of root may be 
shared, using commands like su and sudo. Or root may not even be all- 
powerful, if the system is running some kind of mandatory access control 
(MAC) system such as the NSA’s SELinux. 


See Also 


• Recipe 1.4, “Showing Where You Are”
• Recipe 14.19, “Using sudo More Securely”
• Recipe 16.2, “Customizing Your Prompt”
• Recipe 17.15, “Using sudo on a Group of Commands”


1.4 Showing Where You Are 




Problem 


You are not sure what directory you are in, and the default prompt is not 
helpful. 


Solution


Use the pwd builtin command, or set a more useful prompt (as described in
Recipe 16.2). For example:


bash-4.3$ pwd
/tmp


bash-4.3$ export PS1='[\u@\h \w]$ '
[jp@solaris8 /tmp]$


Discussion


pwd stands for print working directory and takes two options. -L displays 
your logical path and is the default. -P displays your physical location, which 
may differ from your logical path if you have followed a symbolic link. 
Similarly, the cd command also provides -P and -L switches: 


bash-4.3$ pwd 
/tmp/dir2 


bash-4.3$ pwd -L 
/tmp/dir2 


bash-4.3$ pwd -P 
/tmp/dir1 
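

If you would like to reproduce a session like that one, the following sketch sets up the same situation in a scratch directory. The names dir1 and dir2 match the session above, but the scratch location is an assumption and will differ on your system:

```shell
#!/usr/bin/env bash
# Assumption: dir1 and dir2 live in a scratch directory made for this demo.
# Resolve the scratch path physically first, in case the temporary
# directory itself sits behind a symbolic link.
scratch=$(cd "$(mktemp -d)" && pwd -P)
mkdir "$scratch/dir1"
ln -s "$scratch/dir1" "$scratch/dir2"    # dir2 is a symbolic link to dir1

cd "$scratch/dir2" || exit 1
logical=$(pwd -L)     # the path as you followed it, through the link
physical=$(pwd -P)    # the real directory the link points to

echo "logical:  $logical"
echo "physical: $physical"
```

The logical path ends in dir2 because that is how you got there; the physical path ends in dir1 because that is where you actually are.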


See Also 


• Recipe 16.2, “Customizing Your Prompt”


1.5 Finding and Running Commands


Problem


You need to find and run a particular command under bash. 


Solution 


Try the type, which, apropos, locate, slocate, find, and ls commands. 


Discussion 


bash keeps a list of directories in which it should look for commands in an 
environment variable called PATH. The bash builtin type command searches 
your environment (including aliases, keywords, functions, builtins, 
directories in $PATH, and the command hash table) for executable commands 
matching its arguments and displays the type and location of any matches. It 
has several options, notably the -a flag, which causes it to print all matches 
instead of stopping at the first one. The which command is similar but only 
searches your $PATH (and csh aliases). It may vary from system to system 
(it’s usually a csh shell script on BSD, but a binary on Linux), and usually 
has a -a flag like type. Use these commands when you know the name of a 
command and need to know exactly where it’s located, or to see if it’s on this 
computer. For example: 


$ type which 
which is hashed (/usr/bin/which) 


$ type ls
ls is aliased to `ls -F -h'


$ type -a ls
ls is aliased to `ls -F -h'
ls is /bin/ls


$ which which 
/usr/bin/which 
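

Inside a script, the -t option to type is often handier than the full message, since it prints a single word describing what kind of command a name is. Here is a minimal sketch; the greet function is hypothetical and exists only so that type has a function to find:

```shell
#!/usr/bin/env bash
# Hypothetical function, defined only to give type something to report.
greet() { echo "hello"; }

kind_of_cd=$(type -t cd)        # cd is built into bash
kind_of_greet=$(type -t greet)  # the shell function defined above
kind_of_if=$(type -t if)        # a shell reserved word

echo "cd: $kind_of_cd, greet: $kind_of_greet, if: $kind_of_if"

# type -a lists every match it finds, not just the first one.
type -a echo
```

What type -a echo prints will vary from system to system, since it depends on which directories in your $PATH also contain an echo program.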


Almost all commands come with some form of help on how to use them.
Usually there is online documentation called manpages, where “man” is short
for manual. These are accessed using the man command, so man ls will give
you documentation about the ls command. Many programs also have a
built-in help facility, accessed by providing a “help me” argument such as -h
or --help. Some programs, especially on other operating systems, will give
you help if you don’t give them arguments. Some Unix commands will also do
that, but a great many of them will not. This is due to the way that Unix
commands fit together into something called pipelines, which we’ll cover
later.


But what if you don’t know or can’t remember the name of the command you
need? apropos searches manpage names and descriptions for regular
expressions supplied as arguments, which is incredibly useful in exactly that
situation. It is the same as man -k:


$ apropos music 
cms (4) - Creative Music System device driver 


$ man -k music 
cms (4) - Creative Music System device driver 


locate and slocate consult database files about the system (usually compiled 
and updated by a job run from the scheduler system cron) to find files or 
commands almost instantly. The location of the actual database files, what is 
indexed therein, and how often it is checked may vary from system to system. 
Consult your system’s manpages for details. slocate (secure locate) stores
permission information (in addition to filenames and paths) so that it will not
list programs to which the user does not have access. On most Linux systems,
locate is a symbolic link to slocate; other systems may have separate
programs, or may not have slocate at all. Here’s an example:


$ locate apropos
/usr/bin/apropos
/usr/share/man/de/man1/apropos.1.gz
/usr/share/man/es/man1/apropos.1.gz
/usr/share/man/it/man1/apropos.1.gz
/usr/share/man/ja/man1/apropos.1.gz
/usr/share/man/man1/apropos.1.gz


For details on the find command, see Chapter 9. 


Last but not least, try using ls. Remember, if the command you wish to run is
in your current directory, you must prefix it with a ./ since the current
working directory is usually not in your $PATH for security reasons (see
Recipes 14.3 and 14.10).


See Also 

■ help type
■ man which
■ man apropos
■ man locate
■ man slocate
■ man find
■ man ls
■ Chapter 9
■ Recipe 4.1, “Running Any Executable”
■ Recipe 14.3, “Setting a Secure $PATH”
■ Recipe 14.10, “Adding the Current Directory to the $PATH”


1.6 Getting Information About Files 


Problem 


You need more information about a file, such as what it is, who owns it, if
it’s executable, how many hard links it has, or when it was last accessed or
changed.


Solution


Use the ls, stat, file, or find commands:


$ touch /tmp/sample_file 


$ ls /tmp/sample_file 
/tmp/sample_file 


$ ls -l /tmp/sample_file
-rw-r--r-- 1 jp jp 0 Dec 18 15:03 /tmp/sample_file


$ stat /tmp/sample_file
  File: "/tmp/sample_file"
  Size: 0 Blocks: 0 IO Block: 4096 Regular File
Device: 303h/771d Inode: 2310201 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 501/ jp) Gid: ( 501/ jp)
Access: Sun Dec 18 15:03:35 2005
Modify: Sun Dec 18 15:03:35 2005
Change: Sun Dec 18 15:03:42 2005


$ file /tmp/sample_file 
/tmp/sample_file: empty 


$ file -b /tmp/sample_file 
empty 


$ echo '#!/bin/bash -' > /tmp/sample_file 


$ file /tmp/sample_file 
/tmp/sample_file: Bourne-Again shell script text executable 


$ file -b /tmp/sample_file 
Bourne-Again shell script text executable 


For much more on the find command, see Chapter 9. 


Discussion 


The command ls shows only filenames, while ls -l provides more details
about each file. ls has many options; consult the manpage on your system for
the ones it supports. Useful options include:


-a
Do not hide files starting with . (dot).

-A
Like -a, but skips the two common directories . (dot) and .. (dot dot),
since they are present in virtually every directory.

-F
Show the type of file with one of several trailing type designators.
A slash (/) indicates that the file is a directory, an asterisk (*) means the
file is executable, an at sign (@) indicates a symbolic link, an equals sign
(=) is a socket, and a pipe or vertical bar (|) is a FIFO (first in, first out)
buffer.

-l
Use the long listing format.

-L
Show information about the linked file, rather than the symbolic link
itself.

-Q
Quote names (GNU extension, not supported on all systems).

-r
Reverse the sort order.

-R
Recurse through subdirectories.

-S
Sort by file size.

-1
Use the short format, but with only one file per line.


stat, file, and find all have many options that control the output format; see
the manpages on your system for supported options. For example, these
options produce output that is similar to ls -l:


$ ls -l /tmp/sample_file
-rw-r--r-- 1 jp jp 14 Dec 18 15:04 /tmp/sample_file


$ stat -c'%A %h %U %G %s %y %n' /tmp/sample_file
-rw-r--r-- 1 jp jp 14 Sun Dec 18 15:04:12 2005 /tmp/sample_file


$ find /tmp/ -name sample_file -printf '%m %n %u %g %t %p' 
644 1 jp jp Sun Dec 18 15:04:12 2005 /tmp/sample_file 


Not all operating systems and versions have all of these tools. For example, 
Solaris does not include stat by default. 
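
Because the stat options differ between implementations, a script that must run on more than one system may need to probe for the variant it has. The following is a sketch under assumptions: it tries the GNU -c option first and falls back to the BSD/macOS -f option, and the scratch file created with mktemp stands in for /tmp/sample_file:

```shell
#!/usr/bin/env bash
# Create a scratch file with the same content as the earlier example.
sample=$(mktemp)
printf '%s\n' '#!/bin/bash -' > "$sample"

# Ask for the size in bytes using whichever stat variant is installed.
if stat -c '%s' "$sample" >/dev/null 2>&1; then
    size=$(stat -c '%s' "$sample")    # GNU coreutils stat
else
    size=$(stat -f '%z' "$sample")    # BSD and macOS stat (assumed fallback)
fi

echo "size in bytes: $size"
rm -f "$sample"
```

The size reported matches the 14 bytes seen in the ls -l listing above: 13 characters plus the trailing newline.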


It is also worth pointing out that directories are nothing more than files that
the operating system knows to treat specially, so the commands shown here
will work just fine on directories, though sometimes you may need to modify
a command to get the behavior you want. For example, use ls -d to list
information about the directory itself, rather than just ls (which lists the
contents of the directory).


See Also 
■ man ls
■ man stat
■ man file
■ man find
■ Chapter 9


1.7 Showing All Hidden (Dot) Files in the
Current Directory


Problem 


You want to see only hidden (dot) files in a directory to edit a file you’ve
forgotten the name of or remove obsolete files. ls -a shows all files,
including normally hidden ones, but that is often too noisy, and ls -a .*
does more than you think it will, or more than you want.


Solution 


Use ls -d along with whatever other criteria you have. For example: 


ls -d .* 
ls -d .b* 
ls -d .[!.]* 
ls -d .*/ 


Since every normal directory contains a . and .., you don’t need to see those.
You can use ls -A to list all the files in a directory except those two. For
other commands where you list files with a wildcard (i.e., pattern), you can
construct your wildcard in such a way that . and .. don’t match:


$ grep -l 'PATH' ~/.[!.]* 
/home/jp/.bash_history 
/home/jp/.bash_profile 

$ 


Discussion 


Due to the way the shell handles file wildcards, the sequence .* does not
behave as you might expect or desire. The way filename expansion or
globbing works is that any string containing the characters *, ?, or [ is treated
as a pattern, and replaced by an alphabetically sorted list of filenames
matching the pattern. * matches any string, including the null string, while ?
matches any single character. Characters enclosed in [] specify a list or range
of characters, any of which will match. There are also various extended
pattern-matching operators that we’re not going to cover here (see “Pattern-
Matching Characters” and “extglob Extended Pattern-Matching Operators” in
Appendix A). So, *.txt means any file ending in .txt, while *txt means any
file ending in txt (no dot). f?o would match foo or fao but not fooo. Given
that, you might think that .* would match any file beginning with a dot.


The problem is that .* matches both the . and .. directories (present in every 
directory), which are then both displayed along with any other filenames 
beginning with a dot. When ls is given a directory name it doesn’t just list
that directory name, but also the contents of that directory. Instead of getting 
just the dot files in the current directory, then, you get those files plus all the 
files and directories in the current directory (.), all the files and directories in 
the parent directory (..), and the names and contents of any subdirectories in 
the current directory that start with a dot. This can be very confusing, to say 
the least, and is usually more than you want. 


You can experiment with the same ls command with -d and without, then try
echo .*. The echo command simply shows you what the shell expands
your .* into, which would become the arguments to the ls command.


Try echo .[!.]* also. .[!.]* is a filename expansion pattern where [ ] 
denotes a list of characters to match, but the leading ! negates the list. So 
here we are looking for a dot, followed by any character that is not a dot, 
followed by any number of any characters. You may also use ^ to negate a 
character class, but ! is specified in the POSIX standard and thus is more 
portable. 
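
To watch the shell do the expansion without involving ls at all, you can capture what echo receives. This is a minimal sketch, assuming a scratch directory from mktemp; the filenames are arbitrary examples, and LC_ALL=C is set only to keep the sort order predictable:

```shell
#!/usr/bin/env bash
# Create a few dot files in a scratch directory and compare two patterns.
export LC_ALL=C                  # make the glob sort order predictable
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch ..foo .a .normal_dot_file normal_file

not_dot=$(echo .[!.]*)           # a dot, then a non-dot, then anything
three_plus=$(echo .??*)          # a dot plus at least two more characters

echo "pattern .[!.]* -> $not_dot"
echo "pattern .??*   -> $three_plus"

cd / && rm -rf "$scratch"
```

Neither pattern matches . or .., but each one also misses one of the dot files created here.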


There is one other special case in the ls command that helps out here. If the
-d option is specified and if the filename pattern ends with a slash, then only
directories that match that pattern, rather than all filenames that match, are
displayed by the ls command. For example:


$ ls -d .v*
.vim .viminfo .vimrc
$ ls -d .v*/
.vim/
$


The first command shows the three filenames that begin with .v without 
listing (because of the -d option) the contents of any that might be a 
directory. The second command in this example uses a trailing slash on the 
pattern (.v*/) so the ls command only shows directories that match the
pattern; in this case that is just the directory named .vim and no other. 


WARNING 


If you see double slashes on the output of the ls -d .v*/ command, like this:


$ ls -d .v*/
.vim//
$


that’s likely because you are running an alias for ls that includes
the -F flag. Use a backslash in front of the command name to avoid any alias:


$ \ls -d .v*/
.vim/
$


Some combinations are just difficult to match. .[!.]* will miss a file named 
..foo. You could add something like .??* to match anything starting with a 
dot that is also at least three characters long, but ls -d .[!.]* .??* will 
then display anything that matches both patterns twice. Or you can use .??* 
alone, but that will miss files like .a: 


$ touch ..foo .a .normal_dot_file normal_file


$ ls -a
. .. ..foo .a .normal_dot_file normal_file


$ ls -d .??*
..foo .normal_dot_file


$ ls -d .[!.]*
.a .normal_dot_file


$ ls -d .[!.]* .??* | sort -u
..foo
.a
.normal_dot_file


Which you use depends on your needs and environment; there is no good 
one-size-fits-all solution. 


NOTE 


You can use echo * as an emergency substitute for ls if the ls command is
corrupt or not available for some reason. This works because * is expanded by
the shell to everything in the current directory, which results in a list similar to
what you’d get with ls.


See Also 


■ man ls
■ Question 18 in the GNU Core Utilities FAQ
■ Section 2.11 in the Unix FAQs
■ “Pattern-Matching Characters” in Appendix A
■ “extglob Extended Pattern-Matching Operators” in Appendix A


1.8 Using Shell Quoting 


Problem


You need a rule of thumb for using command-line quoting.


Solution 


Enclose a string in single quotes unless it contains elements that you want the 
shell to interpolate. 


Discussion 


Unquoted text and even text enclosed in double quotes is subject to shell 
expansion and substitution. Consider: 


$ echo A coffee is $5?! 
A coffee is ?! 


$ echo "A coffee is $5?!" 
-bash: !": event not found 


$ echo 'A coffee is $5?!' 
A coffee is $5?! 


In the first example, $5 is treated as a variable to expand, but since it doesn’t
exist it is set to null. In the second example, the same is true, but we never 
even get there because ! is treated as a history substitution, which fails in this 
case because it doesn’t match anything in the history. The third example 
works as expected. 


To mix some shell expansions with some literal strings you may use the shell 
escape character \ or change your quoting. The exclamation point is a special 
case because the preceding backslash escape character is not removed. You 
can work around that by using single quotes or a trailing space, as shown 
here: 


$ echo 'A coffee is $5 for' "$USER" '?!'
A coffee is $5 for jp ?!


$ echo "A coffee is \$5 for $USER?\!"
A coffee is $5 for jp?\!


$ echo "A coffee is \$5 for $USER?! "
A coffee is $5 for jp?!


Also, you can’t embed a single quote inside single quotes, even if using a 
backslash, since nothing (not even the backslash) is interpolated inside single 
quotes. But you can work around that by using double quotes with escapes, or 
by escaping a single quote outside of surrounding single quotes. 


# We'll get a continuation prompt since we now have unbalanced quotes
$ echo '$USER won't pay $5 for coffee.'
> ^C


# WRONG
$ echo "$USER won't pay $5 for coffee."
jp won't pay for coffee.


# Works
$ echo "$USER won't pay \$5 for coffee."
jp won't pay $5 for coffee.


# Also works 


$ echo 'I won'\''t pay $5 for coffee.' 
I won't pay $5 for coffee. 
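
The same rules apply when assigning to variables in a script, where history expansion is normally off and the ! character needs no special care. The following is a sketch; customer is a hypothetical variable standing in for $USER so that the result does not depend on who runs it:

```shell
#!/usr/bin/env bash
# "customer" is a hypothetical variable standing in for $USER.
customer='jp'

literal='A coffee is $5?!'                  # single quotes: nothing is expanded
escaped="A coffee is \$5 for $customer"     # double quotes: escape the $ you want kept
embedded='I won'\''t pay $5 for coffee.'    # close the quotes, add \', reopen them

printf '%s\n' "$literal" "$escaped" "$embedded"
```

Using printf with a '%s\n' format prints each argument exactly as stored, which makes it easy to confirm what the quoting produced.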


See Also 


■ Chapter 5 for more about shell variables and the $VAR syntax
■ Chapter 18 for more about ! and the history commands


1.9 Using or Replacing Builtins and External 
Commands 


Problem 


You want to replace a builtin command with your own function or external
command, and you need to know exactly what your script is executing (e.g.,
/bin/echo or the builtin echo). Or you’ve created a new command and it may
be conflicting with an existing external or builtin command.


Solution 


Use the type and which commands to see if a given command exists and 
whether it is built in or external: 


$ type cd 
cd is a shell builtin 


$ type awk 
awk is /usr/bin/awk 


$ which cd
/usr/bin/which: no cd in (/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/ \
local/sbin:/usr/bin/X11:/usr/X11R6/bin:/root/bin)


$ which awk 
/usr/bin/awk 


Discussion 


A builtin command is just that; it is built into the shell itself, while an 
external command is an external file launched by the shell. The external file 
may be a binary, or it may be a shell script itself, and it is important to 
understand the difference for a couple of reasons. First, when you are using a 
given version of a particular shell, builtins will always be available but 
external programs may or may not be installed on a particular system. 
Second, if you give one of your own programs the same name as a builtin, 
you will be very confused about the results since the builtin will always take 
precedence (see Recipe 19.4). It is possible to use the enable command to 
turn builtin commands off and on, though we strongly recommend against 
doing so unless you are absolutely sure you understand what you are doing. 
enable -a will list all builtins and their enabled or disabled status. 


One problem with builtin commands is that you generally can’t use a -h or
--help option to get usage reminders, and if a manpage exists it’s often just a
pointer to the large bash manpage. That’s where the help command, which is
itself a builtin, comes in handy. help displays help about shell builtins:


help: help [-dms] [pattern ...]
    Display information about builtin commands.

    Displays brief summaries of builtin commands. If PATTERN is
    specified, gives detailed help on all commands matching PATTERN,
    otherwise the list of help topics is printed.

    Options:
      -d    output short description for each topic
      -m    display usage in pseudo-manpage format
      -s    output only a short usage synopsis for each topic matching
            PATTERN

    Arguments:
      PATTERN    Pattern specifying a help topic

    Exit Status:
    Returns success unless PATTERN is not found or an invalid option is
    given.


When you need to redefine a builtin you use the builtin command to avoid 
loops. For example, we can define a shell function (see Recipe 10.4) to 
change how the cd command works: 


cd () {
    builtin cd "$@"
    echo "$OLDPWD --> $PWD"
}


To force the use of an external command instead of any function or builtin
that would otherwise have precedence, use enable -n, which turns off shell
builtins, or command, which ignores shell functions. For example, to use the
test found in $PATH instead of the shell builtin version, type enable -n test
and then run test. Or, use command ls to use the native ls command rather
than any ls function you may have created.
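
The interplay between a function, builtin, and command can be seen in a few lines. This is a sketch for illustration only; it wraps echo rather than cd so that the effect is visible in the output, and it removes the wrapper again at the end:

```shell
#!/usr/bin/env bash
# Wrap the echo builtin in a function, then bypass the wrapper two ways.
echo() {
    builtin echo "[wrapped] $*"     # builtin avoids calling this function recursively
}

wrapped=$(echo hello)               # runs the function
direct=$(builtin echo hello)        # runs the shell builtin directly
skipped=$(command echo hello)       # command ignores the function

unset -f echo                       # remove the wrapper again
printf '%s\n' "$wrapped" "$direct" "$skipped"
```

Only the first call goes through the function; the other two print the plain word.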


See Also 

■ man which
■ help help
■ help builtin
■ help command
■ help enable
■ help type
■ Recipe 10.4, “Defining Functions”
■ Recipe 19.4, “Naming Your Script ‘test’”
■ “Builtin Shell Variables” in Appendix A


1.10 Determining if You Are Running 
Interactively 


Problem 


You have some code you want to run only if you are (or are not) running 
interactively. 


Solution 
Use the case statement in Example 1-1. 


Example 1-1. ch01/interactive


#!/usr/bin/env bash
# cookbook filename: interactive


case "$-" in
    *i*) # Code for interactive shell here
    ;;
    *)   # Code for noninteractive shell here
    ;;
esac


Discussion 


$- is a string listing of all the current shell option flags. It will contain i if the 
shell is interactive. 


You may also see code like the following (this will work, but the solution in 
Example 1-1 is the preferred method): 


if [ -n "$PS1" ]; then

echo This shell is interactive 
else 

echo This shell is not interactive 
fi 
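
To see the test respond to a real difference, you can record the result in a variable and compare it with a child shell started by bash -c, which is not interactive. A minimal sketch; the variable names mode and child_mode are illustrative:

```shell
#!/usr/bin/env bash
# Record whether the current shell is interactive, based on the flags in $-.
case "$-" in
    *i*) mode=interactive ;;
    *)   mode=noninteractive ;;
esac
echo "This shell is $mode (flags: $-)"

# A child shell started with -c runs a command string and is not interactive.
child_mode=$(bash -c 'case "$-" in *i*) echo interactive ;; *) echo noninteractive ;; esac')
echo "A bash -c child is $child_mode"
```

Run as a script, both lines report a noninteractive shell; pasted at a prompt, the first line should report an interactive one.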


See Also 


■ help case
■ help set
■ Recipe 6.14, “Branching Many Ways”, for more explanation of the case
statement


1.11 Setting bash as Your Default Shell 


Problem 


You’re using a BSD system, Solaris, or some other Unix variant for which 
bash isn’t the default shell. You’re tired of starting bash explicitly all the 
time, and want to make bash your default shell. 


Solution


First, make sure bash is installed. Try typing bash --version at a command
line. If you get a version, it’s installed: 


$ bash --version 
GNU bash, version 3.00.16(1)-release (i386-pc-solaris2.10)
Copyright (C) 2004 Free Software Foundation, Inc. 


If you don’t see a version number, you may be missing a directory from your 
path. chsh -l or cat /etc/shells may give you a list of valid shells on
some systems. Otherwise, ask your system administrator where bash is, or if 
it can be installed. 


chsh -l provides a list of valid shells on Linux, but opens an editor and
allows you to change settings on BSD. -l is not a valid option to chsh on
macOS, but just running chsh will open an editor to allow you to change
settings, and chpass -s shell will change your shell.


If bash is installed, use the chsh -s command to change your default shell: 
for example, chsh -s /bin/bash. If for any reason that fails, try chsh, 
passwd -e, passwd -l, chpass, or usermod -s /usr/bin/bash. If you 
still can’t change your shell, ask your system administrator, who may need to
edit the /etc/passwd file. On most systems, /etc/passwd will have lines of the 
form: 


cam:pK1Z9BCJbzCrBNrkjRUdUiTtFOh/:501:100:Cameron Newham:/home/cam:/bin/bash
cc:kfDKDjfkeDIKJySFgJFWErrElpe/:502:100:Cheshire Cat:/home/cc:/bin/bash


As root, you can just edit the last field of the lines in the password file to the 
full pathname of whatever shell you choose. If your system has a vipw 
command, you should use it to ensure password file consistency. 
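
The colon-separated layout makes the shell field easy to pick out with read. This is a sketch only; the record is made up for illustration (with x in place of a password hash) rather than read from a real /etc/passwd:

```shell
#!/usr/bin/env bash
# Split one passwd-style record on colons; the record itself is made up.
record='cam:x:501:100:Cameron Newham:/home/cam:/bin/bash'

# The last variable receives the final field, which is the login shell.
IFS=: read -r login _ uid gid gecos home login_shell <<< "$record"

echo "$login (uid $uid, gid $gid) logs in with $login_shell"
```

Setting IFS on the same line as read limits the change to that one command, so the rest of the script keeps its normal word splitting.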


WARNING


Some systems will refuse to allow a login shell that is not listed in /etc/shells. If
bash is not listed in that file, you will have to have your system administrator
add it.


Discussion 


Some operating systems, notably the BSD Unixes, typically place bash in the 
/usr partition. You may want to think twice about changing root’s shell on 
such systems. If the system runs into trouble while booting, and you have to 
work on it before /usr is mounted, you’ve got a real problem: there isn’t a
shell for root to use. Therefore, it’s best to leave the default shell for root 
unchanged. However, there’s no reason not to make bash the default shell for 
regular user accounts. And it goes without saying that it’s bad practice to use 
the root account unless it’s absolutely necessary. Use your regular (user) 
account whenever possible. With commands like sudo, you should very 
rarely need a root shell. 


If all else fails, you can probably replace your existing login shell with bash 
using exec, but this is not for the faint of heart. See “A7) How can I make 
bash my login shell?” in the bash FAQ. 


See Also 


■ man chsh
■ man passwd
■ man chpass
■ /etc/shells
■ “A7) How can I make bash my login shell?” from
ftp://ftp.cwru.edu/pub/bash/FAQ
■ Recipe 1.12, “Keeping bash Updated”
■ Recipe 14.13, “Setting Permissions”
■ Recipe 14.19, “Using sudo More Securely”


1.12 Keeping bash Updated


Problem 


This isn’t really a normal recipe, and it probably goes without saying, but it is 
a topic no one can afford to ignore and we wanted to say it anyway. You need 
to keep both bash and your entire system up-to-date with security patches. 


Solution 


Keeping your entire system up-to-date is out of the scope of this book; 
consult your system administrator and documentation. 


How you keep bash up-to-date depends on how you got it in the first place. 
In the ideal case, it’s part of the system in general and updated when the 
system is updated. That may not be the case if you are using a very old 
system that is no longer supported, in which case you need to update the 
entire thing. If you are using your package system and the originating 
repository is still actively maintained, you should get updates from there—for 
example, from Extra Packages for Enterprise Linux (EPEL) or an Ubuntu 
Personal Package Archive (PPA). 


If you installed from source, it will be up to you to update your source and 
rebuild as appropriate. 


Discussion 


We all know why we need to stay up-to-date, but we’ll cite one well-known 
reason anyway: CVE-2014-6271, better known as the shellshock 
vulnerability. 
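
To see which bash you are actually running, the shell's own version variables are a reasonable first check. The sketch below also includes the widely circulated one-line probe for CVE-2014-6271; treat it as an illustration rather than a complete security audit, and note that the variable names major, minor, probe, and status are ours:

```shell
#!/usr/bin/env bash
# Report the running bash version from the shell's own variables.
major=${BASH_VERSINFO[0]}
minor=${BASH_VERSINFO[1]}
echo "Running bash $major.$minor ($BASH_VERSION)"

# Widely circulated probe for CVE-2014-6271: a patched bash ignores the
# code that follows the function definition in the environment variable.
probe=$(env x='() { :;}; echo vulnerable' bash -c 'echo probe ran' 2>/dev/null)
case "$probe" in
    *vulnerable*) status=vulnerable ;;
    *)            status=patched ;;
esac
echo "Shellshock check: $status"
```

The probe tests whichever bash is found first in your $PATH, which may not be the same binary as /bin/bash.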


See Also 


■ https://fedoraproject.org/wiki/EPEL
■ https://launchpad.net/ubuntu/+ppas
■ https://en.wikipedia.org/wiki/Shellshock_(software_bug)
■ Recipe 1.13, “Getting bash for Linux”
■ Recipe 1.14, “Getting bash for xBSD”
■ Recipe 1.15, “Getting bash for macOS”
■ Recipe 1.16, “Getting bash for Unix”
■ Recipe 1.17, “Getting bash for Windows”


1.13 Getting bash for Linux 


Problem 


You want to get bash for your Linux system, or you want to make sure you 
have the latest version. 


Solution 


bash is included in virtually all modern Linux distributions. To make sure 
you have the latest version available for your distribution, use the 
distribution’s built-in packaging tools. You must be root or have sudo or the 
root password to upgrade or install applications. 


Some Linux distributions (notably the Debian family) use the Debian 
Almquist shell, or dash, as /bin/sh because it is smaller and thus runs a bit 
faster than bash. That switchover caused a lot of confusion when scripts 
assumed that /bin/sh was really bash, as scripts using bash features with 
#!/bin/sh would fail. See Recipe 15.3 for more details.


For Debian and Debian-derived systems such as Ubuntu and Linux Mint, use 
one of the various graphical user interface (GUI) tools or a command-line 
tool such as apt-get, aptitude, or apt to make sure it is installed and current: 


apt-get update && apt-get install bash bash-completion bash-doc 


For Red Hat distributions, including Fedora, Community OS (CentOS), and 
Red Hat Enterprise Linux (RHEL), use the GUI Add/Remove Applications 
tool. For a command line only, use:


yum update bash


For SUSE, use either the GUI or terminal version of YaST. You may also use 
the command-line rpm tool. 
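
If you work across several distributions, a small helper can suggest the right command for the machine at hand. This is a sketch under assumptions: suggest_update is a hypothetical function name, the zypper branch is our assumption for SUSE systems (the text above mentions only YaST and rpm), and the function prints the command instead of running it:

```shell
#!/usr/bin/env bash
# Print (do not run) an update command for the first package tool found.
# "suggest_update" is a hypothetical helper; the zypper branch is an assumption.
suggest_update() {
    if command -v apt-get >/dev/null 2>&1; then
        echo 'apt-get update && apt-get install bash'
    elif command -v yum >/dev/null 2>&1; then
        echo 'yum update bash'
    elif command -v zypper >/dev/null 2>&1; then
        echo 'zypper update bash'
    else
        echo 'no known package tool found; see your distribution documentation'
    fi
}

suggest_update
```

Printing rather than executing keeps the sketch safe to run as a regular user; the real commands need root or sudo.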


Discussion 


It’s impossible to cover every Linux distribution and difficult even to cover 
the major ones, as they are all evolving rapidly. Fortunately, much of that 
evolution is in the area of ease of use, so it should not be very difficult to 
figure out how to install software on your distribution of choice. 


When using LiveCDs, software updates and installations will most likely fail 
due to the read-only media. Versions of such distributions that have been 
installed to a hard disk should be updatable. 


If you are wondering what version of bash is available in a given Linux 
distribution, search for the distro on DistroWatch.com and consult the 
package table. For example, https://distrowatch.com/table.php?distribution=mint
shows what you see in Table 1-1.


Table 1-1. Bash versions in Linux Mint


Package     18      17.3    16      15      14      13      12      11      10
            sarah   rosa    petra   olivia  nadia   maya    lisa    katya   julia
bash (4.4)  4.3     4.3     4.2     4.2     4.2     4.2     4.2     4.2     4.1


See Also 
■ http://wiki.linuxquestions.org/wiki/Installing_Software
■ Debian: http://www.debian.org/doc/
■ dash: https://en.wikipedia.org/wiki/Almquist_shell and
https://wiki.ubuntu.com/DashAsBinSh
■ http://www.debianuniverse.com/readonline/chapter/06
■ Fedora: https://fedoraproject.org/wiki/Yum
■ Red Hat Enterprise Linux: http://red.ht/2uWkEs0
■ SuSE: https://www.suse.com/documentation/
■ OpenSuSE: https://doc.opensuse.org/
■ Recipe 1.11, “Setting bash as Your Default Shell”
■ Recipe 1.12, “Keeping bash Updated”


1.14 Getting bash for xBSD 


Problem 


You want to get bash for your FreeBSD, NetBSD, or OpenBSD system, or 
you want to make sure you have the latest version. 


Solution 
According to Chet Ramey’s bash page: 


Bash-4.3 is included as part of the FreeBSD ports collection, the OpenBSD 
packages collection, and the NetBSD packages collection. 


To see if bash is installed, check the /etc/shells file. To install or update bash, 
use the pkg_add command. If you are an experienced BSD user, you may
prefer using the ports collection, but we will not cover that here. 


If you are wondering what version of bash is available in a given BSD 
distribution, search for the distro on DistroWatch.com and consult the 
package table. For example: 


■ https://distrowatch.com/table.php?distribution=freebsd
■ https://distrowatch.com/table.php?distribution=netbsd
■ https://distrowatch.com/table.php?distribution=openbsd
■ https://distrowatch.com/table.php?distribution=trueos


For FreeBSD, use the command: 
pkg_add -vr bash 


For NetBSD, browse to Application Software for NetBSD and locate the 
latest bash package for your version and architecture, then use a command 
such as: 


pkg_add -vu ftp://ftp.netbsd.org/pub/NetBSD/packages/pkgsrc-2005Q3/NetBSD-2.0/i386/ALL/bash-3.0pl16nb3.tgz


For OpenBSD, you use the pkg_add -vr command. You may have to adjust 
the FTP path for your version and architecture. Also, there may be a statically 
compiled version. For example: 


pkg_add -vr ftp://ftp.openbsd.org/pub/OpenBSD/3.8/packages/i386/bash-3.0.16p1.tgz


Discussion 


FreeBSD and OpenBSD place bash in /usr/local/bin/bash while NetBSD uses
/usr/pkg/bin/bash.
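
Since the location varies, a script that needs the full path to bash can check the usual candidates in turn. A minimal sketch; first_executable is a hypothetical helper name, and the candidate list simply combines the common Linux locations with the BSD paths mentioned above:

```shell
#!/usr/bin/env bash
# Print the first path among the arguments that is an executable file.
# "first_executable" is a hypothetical helper name.
first_executable() {
    local candidate
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

first_executable /bin/bash /usr/bin/bash /usr/local/bin/bash /usr/pkg/bin/bash \
    || echo "bash not found in the usual places"
```

The order of the arguments decides which copy wins when more than one is installed.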


See Also 

■ Recipe 1.11, “Setting bash as Your Default Shell”
■ Recipe 1.12, “Keeping bash Updated”
■ Recipe 15.4, “Testing Scripts Using Virtual Machines”


1.15 Getting bash for macOS


Problem


You want to get bash for your Mac, or you want to make sure you have the 
latest version. 


Solution 
According to Chet Ramey’s bash page: 


Current versions of Mac OS X [now called macOS] (dating from 
Jaguar/Mac OS X 10.2) ship with bash-3.2 as /bin/sh. There are also 
precompiled OS X packages of bash-4.3 available from many web sites, 
though the source packages are usually more up-to-date. Bash for Darwin 
(the base for MacOS X) is available from MacPorts, Homebrew, or Fink. 


Discussion 


It is also possible to build a more recent version of bash from source, but this 
is recommended only for experienced users (see Appendix E). 


See Also 


■ https://tiswww.case.edu/php/chet/bash/bashtop.html#Distributions
■ http://trac.macports.org/browser/trunk/dports/shells/bash
■ http://brew.sh/
■ http://pdb.finkproject.org/pdb/package.php/bash
■ Recipe 1.12, “Keeping bash Updated”
■ Appendix E


1.16 Getting bash for Unix


Problem 


You want to get bash for your Unix system, or you want to make sure you
have the latest version.


Solution 


If it’s not already installed or in your operating system’s program repository, 
check Chet Ramey’s bash page for binary downloads, or build it from source 
(see Appendix E). 


Discussion 
According to Chet Ramey’s bash page: 


The OpenPKG project makes source RPMs of bash-4.3 available for a 
variety of Unix and Linux systems as a core part of the current release. 


Solaris 2.x, Solaris 7/8/9/10/11 users can get a precompiled version of 
bash-4.3 from the Unixpackages site (subscription) or from OpenCSW. 
Oracle ships bash-3.2 as a supported part of Solaris 10 and bash-4.1 as 
part of Solaris 11. The version of Solaris/Illumos distributed as
OpenIndiana includes bash-4.3 as of September 2016. 


AIX users can get precompiled versions of bash-4.3 and older releases for 
various versions of AIX from Groupe Bull, and sources and binaries of 
bash-4.3 for various AIX releases from perzl.org. IBM makes bash-4.2 and 
bash-4.3 available for AIX 5L, AIX 6.1, and AIX 7.1 as part of the AIX 
toolbox for GNU/Linux applications. They use RPM format; you can get 
RPM for AIX from there, too. 


HP-UX users can get bash-4.3 binaries and source code from the Software 
Porting and Archive Center for HP-UX. 


See Also 
■ https://tiswww.case.edu/php/chet/bash/bashtop.html#Distributions
■ http://www.openpkg.org/
  - http://download.openpkg.org/packages/current/source/CORE/
  - http://www.openpkg.org/download/
■ Solaris
  - http://www.unixpackages.com/
  - http://www.opencsw.org/packages/bash/
  - http://www.oracle.com/technetwork/server-storage/solaris10/overview/index.html
  - https://www.oracle.com/solaris/solaris11/index.html
  - http://www.openindiana.org/
■ AIX
  - http://www.bullfreeware.com/, and sources and binaries of bash-4.3 for
    various AIX releases from:
  - http://www.perzl.org/aix/index.php?n=Main.Bash
  - http://www-03.ibm.com/systems/power/software/aix/linux/
■ HP-UX
  - http://hpux.connect.org.uk/hppd/hpux/Shells/
■ Recipe 1.11, “Setting bash as Your Default Shell”
■ Recipe 1.12, “Keeping bash Updated”
■ Appendix E


1.17 Getting bash for Windows


Problem 


You want to get bash for your Windows system, or you want to make sure 
you have the latest version. 


Solution 


Use Cygwin or Ubuntu on Windows, or a virtual machine. Or, don’t use 
bash.


Download Cygwin and run it. Follow the prompts and choose the packages to
install, including bash, which is located in the shells category and is selected 
by default. Once Cygwin is installed, you will have to configure it. See the 
User’s Guide for details. 


For Ubuntu on Windows you need a version of Windows 10 from summer 
2016 or newer; then follow the install instructions, detailed in the Discussion. 


To use a virtual machine, see Recipe 15.4.


Finally, though we hate to say it, maybe the right solution is to use native
tools like PowerShell.


Discussion 


Cygwin 


Cygwin is a Linux-like environment for Windows that provides a Linux look 
and feel. 


From the Cygwin site: 


Cygwin is: 


• a large collection of GNU and Open Source tools which provide 
functionality similar to a Linux distribution on Windows. 


• a DLL (cygwin1.dll) which provides substantial POSIX API 
functionality. 


Cygwin is not: 


• a way to run native Linux apps on Windows. You must rebuild your 
application from source if you want it to run on Windows. 


• a way to magically make native Windows apps aware of UNIX® 
functionality like signals, ptys, etc. Again, you need to build your apps 
from source if you want to take advantage of Cygwin functionality. 


The Cygwin DLL currently works with all recent, commercially released 
x86 32 bit and 64 bit versions of Windows, starting with Windows Vista. 


NOTE 


The previous Cygwin version 2.5.2 was the last version supporting Windows 
XP and Server 2003. 


Cygwin is a true Unix-like environment running on top of Windows. It is an 
excellent tool, but sometimes it might be overkill. For Windows native 
binaries of the GNU Text utils (not including bash), see 
http://unxutils.sourceforge.net/. 


Ubuntu on Windows 


Running Ubuntu on Windows is very interesting, but aside from the fact that 
it includes bash it is out of scope for this book, so we won’t cover it in detail. 
See the references listed in the See Also section for details. 


Briefly: 


• Turn on Developer Mode (see http://bit.ly/2h21MSZ). 
• Search for “Windows Features.” 
• Choose “Turn Windows features on or off,” and enable “Windows 
Subsystem for Linux.” 
• This probably requires a reboot!!! Seriously!?! 
• Open the Command Prompt and type in bash. 
• Download the Windows Subsystem for Linux from the Windows store. 


Using PowerShell or other native tools 


PowerShell is Microsoft’s answer to the power and flexibility of scriptable 
command-line tools and the replacement for the command.com and cmd.exe 
batch files. Other than the fact that it is the Windows native answer to shell 
scripting, it is out of scope for this book, so we won’t cover it. 


Though they pale in comparison to any of the Unix/Linux tools, the old shell 
script languages were more powerful than many people knew. They may be 
appropriate for very simple tasks where any of the other solutions discussed 
here are overkill. See http://www.jpsdomain.org/windows/winshell.html for 
details. 


For powerful character-based and GUI command-line shells with a more 
consistent interface but a DOS/Windows flavor, see http://jpsoft.com. None 
of the authors are affiliated with this company, but one is a long-time 
satisfied user. 


See Also 


• http://www.cygwin.com 
• http://unxutils.sourceforge.net 
• Ubuntu on Windows: 
  - Windows Subsystem for Linux documentation 
  - “Microsoft and Canonical Partner to Bring Ubuntu to Windows 10” by 
    Steven Vaughan-Nichols 
  - “Ubuntu on Windows—The Ubuntu Userspace for Windows 
    Developers” by Dustin Kirkland 
  - “Developers Can Run Bash Shell and User-Mode Ubuntu Linux 
    Binaries on Windows 10” by Scott Hanselman 
  - “Announcing Windows 10 Insider Preview Build 14316” by Gabe Aul 
  - “alwsl Project Lets You Install Arch Linux in the Windows Subsystem 
    for Linux” by Marius Nestor 
  - “How to Install and Use the Linux Bash Shell on Windows 10” by 
    Chris Hoffman 
• https://en.wikipedia.org/wiki/PowerShell 
• http://jpsoft.com 
• http://www.jpsdomain.org/windows/winshell.html 
• Recipe 1.12, “Keeping bash Updated” 
• Recipe 1.18, “Getting bash Without Getting bash” 
• Recipe 15.4, “Testing Scripts Using Virtual Machines” 


1.18 Getting bash Without Getting bash 


Problem 


You want to try out a shell or a shell script on a system you don’t have the 
time or the resources to build or buy. 


Or, you feel like reading a Zen-like recipe just about now. 


Solution 


Get an almost free shell account from http://polarhome.com, which has a 
tiny, symbolic one-time fee, or from another vendor. 


Since almost every Linux and BSD distribution has a LiveCD or LiveDVD 
image, which can also almost certainly be used as a LiveUSB, you can 
download and boot those to experiment. That’s also a good idea if you are 
thinking about switching, so you can verify that all your hardware is 
supported and works. The tricky part may be getting your system’s BIOS or 
UEFI to boot from the CD/DVD or USB. It used to be tricky to “burn” an 
ISO to a USB stick, but there are now many tools and detailed instructions on 
the web for your distro of choice. 


Alternatively, you can use a virtualization solution; see Recipe 15.4. 


Discussion 


Polarhome provides many free services and almost free shell accounts. 
According to the website: 


polarhome.com is a non commercial, educational effort for popularization 
of shell enabled operating systems and Internet services, offering shell 
accounts, development environment, mail and other online services on all 
available systems (currently on different flavours of Linux, MacOS X, 
OpenVMS, Solaris, OpenIndiana, AIX, QNX, IRIX, HP-UX, Tru64, SCO 
OpenServer, UnixWare, FreeBSD, OpenBSD, NetBSD, DragonFly/BSD, 
MirBSD, Ultrix, Minix, GNU Hurd, Syllable and OPENSTEP). 


See Also 
• List of free shell accounts: http://shells.red-pill.eu 
• http://www.polarhome.com 
• Recipe 15.4, “Testing Scripts Using Virtual Machines” 


1.19 Learning More About bash Documentation 


Problem 


You’d like to read more about bash but don’t know where to start. 


Solution 


Well, you’re reading this book, which is a great place to start! The other 
O’Reilly books about bash and shell scripting are Learning the bash Shell, 
3rd Edition, by Cameron Newham and Classic Shell Scripting by Nelson H. 
F. Beebe and Arnold Robbins. 


Unfortunately, not all of the official bash documentation and support files are 
easily accessible online. The Bash Reference Manual is available on the GNU 
Project website, but much of the other material is harder to find or access. 
Our companion website has done all the work for you and provides the 
official bash reference documentation and other useful material online so it’s 
easy to refer to. Check it out, and refer others to it as needed. 


Official documentation 


The official bash FAQ is at ftp://ftp.cwru.edu/pub/bash/FAQ. See especially 
“H2) What kind of bash documentation is there?” The official reference 
guide is also strongly recommended; see below for details. 


Chet Ramey’s bash page contains a ton of very useful information. Chet (the 
current bash maintainer) also maintains the following: 


README 
A file describing bash 


NEWS 
A file tersely listing the notable changes between the current and previous 
versions 


CHANGES 
A complete bash change history 


INSTALL 
Installation instructions 


NOTES 
Platform-specific configuration and operation notes 


COMPAT 
A list of compatibility issues between bash3 and bash1 


The latest bash source code and documentation are always available at 
http://ftp.gnu.org/gnu/bash/. 


We highly recommend downloading both the source and the documentation 
even if you are using prepackaged binaries (see Appendix B for an index of 
the included examples and source code). Here is a brief list of the 
documentation included in the source tarball’s ./doc directory (for example, 
for http://ftp.gnu.org/gnu/bash/bash-4.4.tar.gz/, bash-4.4/doc): 


FAQ 
A set of frequently asked questions about bash with answers 


INTRO 
A short introduction to bash 


article.ms 
An article Chet wrote about bash for The Linux Journal 


bash.1 
The bash manpage 


bashbug.1 
The bashbug manpage 


builtins.1 
A manpage that documents the builtins extracted from bash.1 


bashref.texi 
The Bash Reference Manual 


bashref.info 
The Bash Reference Manual processed by makeinfo 


rbash.1 
The restricted bash shell manpage 


readline.3 
The readline manpage 


The .ps files are PostScript versions of the files listed here. The .html files are 
HTML versions of the manpage and reference manual. The .0 files are 
formatted manual pages. The .txt versions are ASCII, the output of 
groff -Tascii. 


In the document tarball (for example, 
http://ftp.gnu.org/gnu/bash/bash-doc-4.4.tar.gz), you will find formatted 
versions of: 


bash-doc-4.4: 


bash.0 
The bash manpage (also .pdf, .ps, .html) 


bashbug.0 
The bashbug manpage 


bashref 
The GNU Bash Reference Manual (also .pdf, .ps, .html, .dvi) 


builtins.0 
The builtins manpage 


rbash.0 
The restricted bash shell manpage 


Other documentation 

• Mendel Cooper’s “Advanced Bash-Scripting Guide” 
• “Writing Shell Scripts” 
• Mike G’s “BASH Programming — Introduction HOW-TO” 
• Machtelt Garrels’s “Bash Guide for Beginners” 
• Giles Orr’s “Bash Prompt HOWTO” 
• Very old, but still useful: “UNIX shell differences and how to change your 
shell” 
• Apple’s “Shell Scripting Primer” 


See Also 
• Appendix B 
• Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly) 
• Classic Shell Scripting by Nelson H. F. Beebe and Arnold Robbins 
(O’Reilly) 
• The Bash Reference Manual 
• http://www.bashcookbook.com/ 
• ftp://ftp.cwru.edu/pub/bash/FAQ 
• https://www.case.edu/php/chet/bash/bashtop.html 
• http://ftp.gnu.org/gnu/bash/ 


Chapter 2. Standard Output 


No software is worth anything if there is no output of some sort, but I/O has 
long been one of the nastier areas of computing. If you’re ancient, you 
remember the days when most of the work involved in running a program 
was setting up the program’s input and output. Some of the problems have 
gone away; for example, you no longer need to get operators to mount tapes 
on a tape drive (at least, not on any laptop or desktop system that we’ve 
seen!). But many of the difficulties are still with us. 


One problem is that there are many different types of output. Writing 
something on the screen is different from writing something in a file—at 
least, it sure seems different. Writing something in a file also seems different 
from writing it on a tape, or in flash memory, or on some other kind of 
device. And what if you want the output from one program to go directly into 
another program? Should software developers be tasked with writing code to 
handle all sorts of output devices, even ones that haven’t been invented yet? 
That’s certainly inconvenient. Should users have to know how to connect the 
programs they want to run to different kinds of devices? That’s not a very 
good idea, either. 


One of the most important ideas behind the Unix operating system was that 
everything looked like a file (an ordered sequence of bytes). The operating 
system was responsible for this magic. It didn’t matter whether you were 
writing to a file on the disk, the terminal, a tape drive, a memory stick, or 
something else; your program only needed to know how to write to a file, and 
the operating system would take it from there. That approach greatly 
simplified the problem. 


The next question was simply, “Which file?” How does a program know 
whether to write to the file that represents a terminal window, a file on the 
disk, or some other kind of file? Simple: that’s something that can be left to 
the shell. 


When you run a program, you still have to connect it to output files and input 
files (which we’ll explore in the next chapter). That task doesn’t go away, but 
the shell makes it trivially easy. A command as simple as: 


dosomething < inputfile > outputfile 


reads its input from inputfile and sends its output to outputfile. If you 
omit the > outputfile, the output goes to your terminal window. If you 
omit the < inputfile, the program takes its input from the keyboard. The 
program literally doesn’t know where its output is going, or where its input is 
coming from. You can send the output anywhere you want (including to 
another program) by using bash’s redirection facilities. 
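
To make this concrete, here is a small sketch in which sort stands in for dosomething. The scratch directory and the fruit names are our own placeholders, not part of any real program:

```shell
# Work in a throwaway directory; the names below are illustrative only.
workdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$workdir/inputfile"

# sort plays the role of "dosomething": it reads through < and writes
# through >, and is never given either filename as an argument.
sort < "$workdir/inputfile" > "$workdir/outputfile"

# Leave off the > redirection and the same output goes to the terminal.
sort < "$workdir/inputfile"
```

Notice that sort itself is unchanged between the two invocations; only the shell's plumbing differs.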


But that’s just the start. In this chapter, we’ll look at ways to generate output, 
and the shell’s methods for sending that output to different places. 


2.1 Writing Output to the Terminal/Window 


Problem 


You want some simple output from your shell commands. 


Solution 


Use the echo builtin command. All the parameters on the command line are 
printed to the screen. For example: 


echo Please wait. 
produces: 
Please wait. 


as we see in this simple session where we typed the command at the bash 
prompt (the $ character): 


$ echo Please wait. 
Please wait. 
$ 


Discussion 


The echo command is one of the simplest of all bash commands. It prints the 
arguments of the command line to the screen. But there are a few points to 
keep in mind. First, the shell is parsing the arguments on the echo command 
line (like it does for every other command line). This means that it does all its 
substitutions, wildcard matching, and other things before handing the 
arguments off to echo. Second, since they are parsed as arguments, the 
spacing between arguments is ignored. For example: 


$ echo this    was  very    widely    spaced 
this was very widely spaced 
$ 


Normally the fact that the shell is very forgiving about whitespace between 
arguments is a helpful feature. Here, with echo, it’s a bit disconcerting (see 
Recipe 2.2 for tips on preserving whitespace in output and Recipe 13.15 for 
tips on trimming it from your data). 
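
Both points can be seen in one line. The following is a minimal sketch; the variable name greeting is made up for the illustration:

```shell
# "greeting" is a made-up variable for this sketch.
greeting='Hello,   world'

# Unquoted: the shell expands $greeting and $((2 + 3)), splits the result
# on whitespace, and hands echo four separate words. echo then prints
# them with a single space between each.
echo $greeting     sum:    $((2 + 3))
# prints: Hello, world sum: 5
```

By the time echo runs, the variable, the arithmetic, and the extra spaces are all gone; echo only ever sees the four resulting words.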


See Also 


• help echo 
• help printf 
• Recipe 2.2, “Writing Output but Preserving Spacing” 
• Recipe 2.3, “Writing Output with More Formatting Control” 
• Recipe 13.15, “Trimming Whitespace” 
• Recipe 15.6, “Using echo Portably” 
• Recipe 19.1, “Forgetting to Set Execute Permissions” 
• “echo Options and Escape Sequences” in Appendix A 
• “printf” in Appendix A 


2.2 Writing Output but Preserving Spacing 


Problem 


You want the output to preserve your spacing. 


Solution 


Enclose the string in quotes. The previous example, but with quotes added, 
will preserve our spacing: 


$ echo "this    was  very    widely    spaced" 
this    was  very    widely    spaced 
$ 


or: 


$ echo 'this    was  very    widely    spaced' 
this    was  very    widely    spaced 
$ 


Discussion 


Since the words are enclosed in quotes, they form a single argument to the 
echo command. That argument is a string, and the shell doesn’t need to 
interfere with the contents of the string. In fact, by using single quotes (' ') 
you explicitly tell the shell not to interfere with the string at all. If you use 
double quotes (""), some shell substitutions do take place (variable, 
arithmetic, and tilde expansions and command substitutions), but since we 
have none in this example, the shell has nothing to change. When in doubt, 
use the single quotes. 
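
The difference between the two kinds of quotes shows up as soon as the string contains something the shell could expand. Here is a short sketch, assuming a variable named fruit that we invented for the purpose:

```shell
# "fruit" is a hypothetical variable used only for this demonstration.
fruit='ripe   mango'

# Single quotes: everything is literal, including the $ sign.
echo 'I ate a $fruit   today'
# prints: I ate a $fruit   today

# Double quotes: $fruit is expanded, but the spacing is still preserved.
echo "I ate a $fruit   today"
# prints: I ate a ripe   mango   today
```

In both cases echo receives exactly one argument, which is why the runs of spaces survive.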


See Also 


• help echo 
• help printf 
• Chapter 5 for more information about substitution 
• Recipe 2.3, “Writing Output with More Formatting Control” 
• Recipe 15.6, “Using echo Portably” 
• Recipe 19.11, “Seeing Odd Behavior from printf” 
• “echo Options and Escape Sequences” in Appendix A 


2.3 Writing Output with More Formatting Control 


Problem 


You want more control over the formatting and placement of output. 


Solution 


Use the printf builtin command. For example: 


$ printf '%s = %d\n' Lines $LINES 
Lines = 24 
$ 


or: 


$ printf '%-10.10s = %4.2f\n' 'Gigahertz' 1.92735 
Gigahertz  = 1.93 
$ 


Discussion 
The printf builtin command behaves like the C language library call, where 
the first argument is the format control string and the successive arguments 
are formatted according to the format specifications (%). 


The numbers between the % and the format type (s or f in our example) 
provide additional formatting details. For the floating-point type (f), the first 
number (4 in the 4.2 specifier) is the width of the entire field. The second 
number (2) is how many digits should be printed to the right of the decimal 
point. Note that it rounds the answer. 


For a string, the first number is the minimum field width, and the second is 
the maximum number of bytes to be printed. The string will be truncated (if 
longer than max) or blank padded (if shorter than min) as needed. When the 
max and min specifiers are the same, then the string is guaranteed to be that 
length. The minus sign on the specifier means to left-align the string (within 
its field width). Without the minus sign, the string would right-justify, thus: 


$ printf '%10.10s = %4.2f\n' 'Gigahertz' 1.92735 
 Gigahertz = 1.93 
$ 
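
Padding is hard to see on the page, so here is a small sketch that wraps each field in square brackets. The brackets and the longer sample word are our additions for visibility only:

```shell
# Brackets make the padding visible. The sample words are arbitrary.
printf '[%-10.10s]\n' 'Gigahertz'         # 9 characters: padded on the right
printf '[%10.10s]\n'  'Gigahertz'         # 9 characters: padded on the left
printf '[%-10.10s]\n' 'Gigahertz-class'   # 15 characters: cut off after 10
printf '[%4.2f]\n'    1.92735             # rounded to two decimal places
```

With the width and the precision both set to 10, every string field comes out exactly 10 characters wide, whichever way it has to be adjusted.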


The string argument can either be quoted or unquoted. Use quotes if you need 
to preserve embedded spacing (there were no spaces needed in our one-word 
strings), or if you need to escape the special meaning of any special 
characters in the string (again, our example had none). It’s a good idea to be 
in the habit of quoting any string that you pass to printf, so that you don’t 
forget the quotes when you need them. 


See Also 


• help printf 
• http://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html 
• Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly), 
page 171, or any C reference on its printf function 
• Recipe 15.6, “Using echo Portably” 
• Recipe 19.11, “Seeing Odd Behavior from printf” 
• “printf” in Appendix A 


2.4 Writing Output Without the Newline 


Problem 


You want to produce some output without the default newline that echo 
provides. 


Solution 


With printf it’s easy; just leave off the ending \n in your format string: 


$ printf "%s %s" next prompt 
next prompt$ 


With echo, use the -n option: 


$ echo -n prompt 
prompt$ 


Discussion 


Since there was no newline at the end of the printf format string (the first 
argument), the prompt character ($) appears right where the printf left off. 
This feature is much more useful in shell scripts where you may want to do 
partial output across several statements before completing the line, or where 
you want to display a prompt to the user before reading input. 
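
Here is a minimal sketch of the prompt case. The function name ask_name and the wording of the prompt are hypothetical, chosen only for this example:

```shell
# ask_name is a hypothetical helper, not part of bash.
ask_name() {
    printf '%s' 'Enter your name: '   # no \n, so the cursor stays on this line
    read -r name                      # the reply is typed right after the prompt
    printf 'Hello, %s!\n' "$name"     # this line does end with a newline
}
```

Calling ask_name from a script leaves the cursor sitting just after the colon, which is what most users expect from a prompt. Reading input is covered properly in Chapter 3.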


With the echo command (see Recipe 15.6), there are two ways to eliminate 
the newline. First, the -n option suppresses the trailing newline. The echo 
command also has several escape sequences with special meanings similar to 
those in C language strings (e.g., \n for newline). To use these escape 
sequences, you must invoke echo with the -e option. One of echo’s escape 
sequences is \c, which doesn’t print a character, but rather inhibits printing 
the ending newline. Thus, here’s a third solution: 


$ echo -e 'hi\c' 
hi$ 


Because of the powerful and flexible formatting that printf provides, and 
because it is a builtin with very little overhead to invoke (unlike in other 
shells or older versions of bash, where printf was a standalone executable), 
we will use printf for many of our examples throughout the book. 


See Also 


• help echo 
• help printf 
• http://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html 
• Chapter 3, particularly Recipe 3.5, “Getting User Input” 
• Recipe 2.3, “Writing Output with More Formatting Control” 
• Recipe 15.6, “Using echo Portably” 
• Recipe 19.11, “Seeing Odd Behavior from printf” 
• “echo Options and Escape Sequences” in Appendix A 
• “printf” in Appendix A 


2.5 Saving Output from a Command 


Problem 


You want to keep the output from a command by putting it in a file. 


Solution 


Use the > symbol to tell the shell to redirect the output into a file. For 
example: 


$ echo fill it up 
fill it up 
$ echo fill it up > file.txt 
$ 


Just to be sure, let’s look at what is inside file.txt to see if it captured our 
output: 


$ cat file.txt 
fill it up 
$ 


Discussion 


The first line of the first part of the example shows an echo command with 
three arguments that are printed out. The second line uses the > to capture 
that output into a file named file.txt, which is why no output appears after that 
echo command. 


The second part of the example uses cat to display the contents of the file. 
We can see that the file contains what echo would have otherwise sent as 
output. 


The cat command gets its name from the longer word concatenation. The cat 
command concatenates the contents of the files listed on its command line, 
so if you enter cat file1 filetwo anotherfile morefiles the contents 
of those files will be sent, one after another, to the terminal window. If a large 
file has been split in half, you can also use cat to glue it back together (i.e., 
concatenate the two halves) by capturing the output into a third file: 


cat first.half second.half > whole.file 


So our simple command, cat file.txt, is really just the trivial case of 
concatenating only one file, with the result sent to the screen. That is to say, 
while cat is capable of more, its primary use in this example is to dump the 
contents of a file to the screen. 
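
To see the gluing at work, here is a sketch that splits a tiny file with head and tail and then reassembles it. The four-line file and the scratch directory are assumptions made for the example:

```shell
# Build a four-line file in a throwaway directory (names are illustrative).
workdir=$(mktemp -d)
printf 'line %d\n' 1 2 3 4 > "$workdir/whole.orig"

# Split it into two halves.
head -n 2 "$workdir/whole.orig" > "$workdir/first.half"
tail -n 2 "$workdir/whole.orig" > "$workdir/second.half"

# Glue the halves back together, capturing the result in a third file.
cat "$workdir/first.half" "$workdir/second.half" > "$workdir/whole.file"

# cmp is silent and succeeds when the two files are byte-for-byte equal.
cmp "$workdir/whole.orig" "$workdir/whole.file" && echo 'identical'
```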


See Also 


• man cat 
• Recipe 17.23, “Numbering Lines” 


2.6 Saving Output to Other Files 


Problem 


You want to save the output with a redirect to elsewhere in the filesystem, not 
in the current directory. 


Solution 


Use more of a pathname when you redirect the output: 
echo some more data > /tmp/echo.out 
or: 


echo some more data > ../../over.here 


Discussion 


The filename that appears after the redirection character (the >) is actually a 
pathname. If it has no leading directory qualifiers, the file will be placed in 
the current directory. 


If that filename begins with a slash (/) then it is an absolute pathname, and 
output will be placed where it specifies in the filesystem hierarchy (i.e., tree), 
beginning at the root (provided all the intermediary directories exist and have 
permissions that allow you to traverse them). We used /tmp since it is a 
well-known, universally available scratch directory on virtually all Unix 
systems. The shell, in this example, will create the file named echo.out in 
the /tmp directory. 


Our second example, placing the output into ../../over.here, uses a relative 
pathname, and the .. is the specially named directory inside every directory 
that refers to the parent directory. So, each reference to .. moves up a level in 
the filesystem tree (toward the root, not what we usually mean by “up” in a 
tree). The point here is that we can redirect our output, if we want, into a file 
that is far away from where we are running the command. 
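
The following sketch tries both kinds of pathname inside a scratch tree, so nothing is written to your real directories. The directory names a and b are arbitrary choices of ours:

```shell
# Build a small throwaway tree; all names here are illustrative.
top=$(mktemp -d)
mkdir -p "$top/a/b"
cd "$top/a/b" || exit 1

# Absolute pathname: the result does not depend on the current directory.
echo some more data > "$top/echo.out"

# Relative pathname: each .. climbs one level, so this lands in $top.
echo some more data > ../../over.here

# Both files now sit two levels above where the commands were run.
ls "$top"
```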


See Also 


• Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly), 
pages 7-10, for an introduction to files, directories, and the dot notation 
(i.e., . and ..) 


2.7 Saving Output from the ls Command 


Problem 


You tried to save output from the ls command with a redirect, but when you 
look at the resulting file, the format is not what you expected. 


Solution 


Use the -C option on ls when you redirect the output. 


Here’s the ls command showing the contents of a directory: 


$ ls 
a.out cong.txt def.conf file.txt more.txt zebra.list 
$ 


But when we save the output with the > to redirect it to a file, and then show 
the file contents, we get one file per line, like this: 


$ ls > /tmp/save.out 
$ cat /tmp/save.out 
a.out 
cong.txt 
def.conf 
file.txt 
more.txt 
zebra.list 
$ 


This time we’ll use the -C option: 


$ ls -C > /tmp/save.out 
$ cat /tmp/save.out 
a.out cong.txt def.conf file.txt more.txt zebra.list 
$ 


Alternatively, if we use the -1 option on ls when we don’t redirect, we get 
output like this: 


$ ls -1 
a.out 
cong.txt 
def.conf 
file.txt 
more.txt 
zebra.list 
$ 


The original attempt at redirection matches this output. 


Discussion 


Just when you thought that you understood redirection and you tried it on a 
simple ls command, it didn’t quite work right. What’s going on here? 


The shell’s redirection is meant to be transparent to all programs, so 
programs don’t need special code to make their output redirectable. The shell 
takes care of it when you use the > to send the output elsewhere. But it turns 
out that code can be added to a program to figure out whether its output is 
going to a terminal (see man isatty). Then, the program can behave 
differently in those two cases, and that’s what ls is doing. 


The authors of ls figured that if your output is going to the screen, then you 
probably want columnar output (the -C option), as screen real estate is 
limited. But they assumed if you’re redirecting it to a file, then you’ll want 
one file per line (the -1 option) since there are more interesting things you 
can do (i.e., other processing) that are easier if each filename is on a line by 
itself. 
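
A shell script can make the same decision with the test [ -t 1 ], which succeeds when file descriptor 1 is a terminal. The sketch below imitates the behavior described above; show_names is a hypothetical function of ours and is not how ls itself is written:

```shell
# show_names is a hypothetical function that mimics the ls behavior.
show_names() {
    if [ -t 1 ]; then
        # File descriptor 1 is a terminal: put everything on one line.
        printf '%s  ' "$@"
        printf '\n'
    else
        # Output is redirected or piped: one name per line.
        printf '%s\n' "$@"
    fi
}

show_names a.out cong.txt def.conf
```

Run it at the prompt and then again with > /tmp/save.out appended, and the layout changes just as it does for ls.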


See Also 


• man ls 
• man isatty 
• Recipe 2.6, “Saving Output to Other Files” 


2.8 Sending Output and Error Messages to Different Files 


Problem 


You are expecting output from a program, but you don’t want it to get littered 
with error messages. You’d like to save your error messages, but it’s harder 
to find them mixed among the expected output. 


Solution 


Redirect output and error messages to different files: 


myprogram 1> messages.out 2> message.err 


or more commonly: 


myprogram > messages.out 2> message.err 


Discussion 


This example shows two different output files created by the shell. The first, 
messages.out, will get all the output from the hypothetical myprogram 
redirected into it. Any error messages from myprogram will be redirected 
into message.err. 


In the constructs 1> and 2> the number is the file descriptor. 1 is standard 
output (STDOUT) and 2 is standard error (STDERR). Numbering starts at 0, 
for standard input (STDIN). When no number is specified, STDOUT is 
assumed. For more information on file descriptors and the difference between 
STDOUT and STDERR, see Recipe 2.19. 
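
Since myprogram is hypothetical, here is a sketch that defines a stand-in shell function with the same name so the redirections can be tried out. The messages and the scratch directory are invented for the example:

```shell
# myprogram is a stand-in function that writes to both streams.
myprogram() {
    echo 'normal output'                # goes to file descriptor 1
    echo 'something went wrong' >&2     # goes to file descriptor 2
}

outdir=$(mktemp -d)
myprogram > "$outdir/messages.out" 2> "$outdir/message.err"

cat "$outdir/messages.out"   # holds only what was written to STDOUT
cat "$outdir/message.err"    # holds only what was written to STDERR
```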


See Also 

• Recipe 2.6, “Saving Output to Other Files” 
• Recipe 2.13, “Throwing Output Away” 
• Recipe 2.19, “Saving Output When Redirect Doesn’t Seem to Work” 


2.9 Sending Output and Error Messages to the Same File 


Problem 


Using redirection, you can redirect output or error messages to separate files, 
but how do you capture all the output and error messages to a single file? 


Solution 


Use the shell syntax to redirect standard error messages to the same place as 
standard output. 


Preferred: 


both >& outfile 


or: 


both &> outfile 


or older and slightly more verbose (but also more portable): 


both > outfile 2>&1 


where both is just our (imaginary) program that is going to generate output to 
both STDERR and STDOUT. 


Discussion 


&> and >& are shortcuts that simply send both STDOUT and STDERR to the 
same place—exactly what we want to do. 


In the third example, the 1 appears to be used as the target of the redirection, 
but the >& says to interpret the 1 as a file descriptor instead of a filename. In 
fact, the 2>&1 is a single entity—no spaces allowed—indicating that standard 
error (2) will be redirected (>) to a file descriptor (&) that follows (1). The 2>& 
all has to appear together without spaces; otherwise the 2 would look just like 
another argument, and the & actually means something completely different 
when it appears by itself. (It has to do with running the command in the 
background.) 


It may help to think of all redirection operators as taking a leading number 
(e.g., 2>), but that the default number for > is 1, the standard output file 
descriptor. 


You could also do the redirection in the other order, though it is slightly less 
readable, and redirect standard output to the same place to which you have 
already redirected standard error (in fact you must do it this way if you are 
using a pipe; see Recipe 2.15): 


both 2> outfile 1>&2 


The 1 indicates standard output and the 2 standard error. We could have 
written just >&2 for that last redirection, since 1 is the default for >, but we 
find it more readable to write the number explicitly when redirecting file 
descriptors. 
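
The two portable forms can be compared side by side. In this sketch both is a stand-in function we define ourselves, and the output filenames are placeholders:

```shell
# both is a stand-in for a program that writes to STDOUT and STDERR.
both() {
    echo 'to stdout'
    echo 'to stderr' >&2
}

outdir=$(mktemp -d)

# Send STDOUT to the file, then point STDERR at the same place.
both > "$outdir/outfile" 2>&1

# Reverse order: send STDERR to the file, then point STDOUT at STDERR.
both 2> "$outdir/outfile2" 1>&2

# Each file ends up holding both lines.
cat "$outdir/outfile"
cat "$outdir/outfile2"
```

In bash, both &> "$outdir/outfile" would be the shorter way to write the first form.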


NOTE 


Note the order of the contents of the output file. Sometimes the error messages 
may appear sooner in the file than they do on the screen. That has to do with the 
unbuffered nature of standard error, and the effect becomes more pronounced 
when writing to a file instead of the screen. 


See Also 


• Recipe 2.6, “Saving Output to Other Files” 
• Recipe 2.13, “Throwing Output Away” 


2.10 Appending Rather than Clobbering Output 


Problem 


Each time you redirect your output, it creates that output file anew. What if 
you want to redirect output a second (or third, or...) time, and don’t want to 
clobber the previous output? 


Solution 


The double greater-than sign (>>) is a bash redirector that means append the 
output: 


$ ls > /tmp/ls.out 
$ cd ../elsewhere 
$ ls >> /tmp/ls.out 
$ cd ../anotherdir 
$ ls >> /tmp/ls.out 
$ 


Discussion 


The first line includes a redirect that truncates the file if it exists and starts 
with a clean (empty) file, filling it with the output from the ls command. 


The second and third invocations of ls use the double greater-than sign (>>) 
to indicate appending to, rather than replacing the contents of, the output file. 


If you want to have error messages (i.e., STDERR) included in the 
redirection, specify the STDERR redirection after redirecting STDOUT, like 
this: 


ls >> /tmp/ls.out 2>&1 


As of bash version 4 you can combine both of those redirections in one: 


ls &>> /tmp/ls.out 


which will redirect both STDERR and STDOUT and append them to the 
specified file. Just remember that the ampersand must come first and no 
spacing is allowed between the three characters. 
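
The following sketch shows truncating and appending against a single scratch file. The file and the messages are invented for the example, and it sticks to the portable >> ... 2>&1 spelling so it also runs in shells that lack the three-character operator:

```shell
logfile=$(mktemp)                      # an illustrative scratch file

echo 'first run'  >  "$logfile"        # > truncates: the file has one line
echo 'second run' >> "$logfile"        # >> appends: the file has two lines

# Append STDOUT and STDERR together from a group of commands.
{ echo 'third run'; echo 'a warning' >&2; } >> "$logfile" 2>&1

cat "$logfile"                         # four lines, oldest first
```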


See Also 


• Recipe 2.6, “Saving Output to Other Files” 
• Recipe 2.13, “Throwing Output Away” 


2.11 Using Just the Beginning or End of a File 


Problem 


You need to display or use just the beginning or end of a file. 


Solution 


Use the head or tail command. By default, head will output the first 10 lines 
and tail will output the last 10 lines of the given file. If more than one file is 
given, the appropriate lines from each of them are output. Use the -number 
switch (e.g., -5) to change the number of lines. tail also has the -f and -F 
switches, which follow the end of the file as it is written to, and it has an 
interesting + switch that we cover in Recipe 2.12. 
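
As a quick sketch, we can generate a 20-line file with seq and look at either end of it. The scratch file is an assumption of the example, and -n 3 is the POSIX spelling of the -3 shorthand:

```shell
datafile=$(mktemp)                 # illustrative scratch file
seq 1 20 > "$datafile"             # twenty numbered lines

head "$datafile"                   # default: lines 1 through 10
tail "$datafile"                   # default: lines 11 through 20
head -n 3 "$datafile"              # lines 1, 2, and 3
tail -n 3 "$datafile"              # lines 18, 19, and 20
```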


Discussion 


head and tail, along with cat, grep, sort, cut, and uniq, are some of the most 
commonly used Unix text processing tools out there. If you aren’t already 
familiar with them, you’ll soon wonder how you ever got along without 
them. 


See Also 

• Recipe 2.12, “Skipping a Header in a File” 
• Recipe 7.1, “Sifting Through Files for a String” 
• Recipe 8.1, “Sorting Your Output” 
• Recipe 8.4, “Cutting Out Parts of Your Output” 
• Recipe 8.5, “Removing Duplicate Lines” 
• Recipe 17.23, “Numbering Lines” 


2.12 Skipping a Header in a File 
Problem 


You have a file with one or more header lines and you need to process just 
the data, and skip the header. 


Solution 


Use the tail command with a special argument. For example, to skip the first 
line of a file: 


$ tail -n +2 lines 
Line 2 
Line 3 
Line 4 
Line 5 


$ 


Discussion 


An argument to tail, of the format -n number (or just -number), will specify 
a line offset relative to the end of the file. So, tail -n 10 file shows the 
last 10 lines of file, which also happens to be the default if you don’t 
specify anything. Specifying a number starting with a plus sign (+) indicates 
an offset relative to the top of the file. Thus, tail -n +1 file gives you the 
entire file, tail -n +2 skips the first line, and so on. 
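

A typical use is to drop a header and pass the remaining rows to another tool. The following is a sketch only; the file people.csv and its contents are invented for the example:

```shell
#!/usr/bin/env bash
# Sketch: skip a one-line header, then process only the data rows.
# The file people.csv and its rows are made up for this example.
tmpdir=$(mktemp -d)
csv="$tmpdir/people.csv"
printf '%s\n' 'name,age' 'ann,31' 'bob,27' 'cy,45' > "$csv"

# Start output at line 2, then sort numerically on the second field.
sorted=$(tail -n +2 "$csv" | sort -t, -k2,2n)

# Count the data rows (the header is not included).
data_rows=$(tail -n +2 "$csv" | wc -l)

echo "$sorted"
echo "data rows: $data_rows"
rm -rf "$tmpdir"
```

With a two-line header you would start at line 3 instead, using tail -n +3.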


See Also 


= man tail 


= Recipe 13.12, “Setting Up a Database with MySQL” 


2.13 Throwing Output Away 


Problem 


Sometimes you don’t want to save the output into a file; in fact, sometimes 
you don’t even want to see it at all. 


Solution 


Redirect the output to /dev/null as shown in these examples: 


find / -name myfile -print 2> /dev/null 


or: 


noisy > /dev/null 2>&1 


Discussion 


You could redirect the unwanted output into a file, then remove the file when 
you’re done. But there is an easier way. Unix and Linux systems have a 
special device that isn’t real hardware at all, just a bit bucket where we can 
dump unwanted data. It’s called /dev/null and is perfect for these situations. 
Any data written there is simply thrown away, so it takes up no disk space. 
Redirection makes it easy. 


In the first example, only the output going to standard error is thrown away. 
In the second example, both standard output and standard error are discarded. 


In rare cases, you may find yourself in a situation where /dev is on a read- 
only filesystem (for example, certain information security appliances), in 
which case you are stuck with the first suggestion of writing to a file and then 
removing it. 
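

One common case is a command whose exit status matters but whose output does not. Here is a minimal sketch of that, with file names chosen only for illustration:

```shell
#!/usr/bin/env bash
# Sketch: keep the exit status, throw the output away.
# The file words.txt and the name "missing" are made up for this example.
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/words.txt"

# We only care whether grep finds a match, not what it prints.
if grep beta "$tmpdir/words.txt" > /dev/null 2>&1; then
    found=yes
else
    found=no
fi

# Discard only the errors; normal output still comes through.
listing=$(ls "$tmpdir/words.txt" "$tmpdir/missing" 2> /dev/null)

echo "beta present: $found"
echo "$listing"
expected="$tmpdir/words.txt"
rm -rf "$tmpdir"
```

The second command mirrors the find example: the complaint about the missing file is discarded while the normal listing is kept.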


See Also 
= Recipe 2.6, “Saving Output to Other Files” 


2.14 Saving or Grouping Output from Several 
Commands 


Problem 


You want to capture the output with a redirect, but you’re typing several 
commands on one line: 


pwd; ls; cd ../elsewhere; pwd; ls > /tmp/all.out 


The final redirect applies only to the last command, the last ls on that line. All 
the other output appears on the screen (i.e., does not get redirected). 


Solution 


Use braces ({ }) to group these commands together; then redirection applies 
to the output from all commands in the group. For example: 


{ pwd; ls; cd ../elsewhere; pwd; ls; } > /tmp/all.out 


WARNING 


There are two very subtle catches here. The braces are actually reserved words, 
so they must be surrounded by whitespace. Also, the trailing semicolon is 
required before the closing brace. 


Alternatively, you could use parentheses, (), to tell bash to run the 
commands in a subshell, then redirect the output of the entire subshell’s 
execution. For example: 


(pwd; ls; cd ../elsewhere; pwd; ls) > /tmp/all.out 


Discussion 


While these two solutions look very similar, there are two important 
differences. The first difference is syntactic, the second semantic. 
Syntactically, the braces need to have whitespace around them, and the last 
command inside the list must terminate with a semicolon. That’s not required 
when you use parentheses. The bigger difference, though, is semantic—what 
these constructs mean. The braces are just a way to group several commands 
together, more like a shorthand for our redirecting, so that we don’t have to 
redirect each command separately. Commands enclosed in parentheses, 
however, run in another instance of the shell, a child of the current shell 
called a subshell. 


The subshell is almost identical to the current shell’s environment; i.e., 
variables, including $PATH, are all the same, but traps are handled differently 
(for more on traps, see Recipe 10.6). Now here is the big difference in using 
the subshell approach: because a subshell is used to execute the cd 
commands, when the subshell exits, your main shell remains where it started. 
That is, its current directory hasn’t moved, and its variables haven’t changed. 


With the braces used for grouping, you end up in the new directory 
(../elsewhere in our example). Any other changes that you make (variable 
assignments, for example) will be made to your current shell instance. While 
both approaches result in the same output, they leave you in very different 
places. 
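

The difference is easy to observe. The sketch below runs the same commands in a subshell and in braces and records where the shell ends up each time; the directory names and the variable marker are assumptions made for the example:

```shell
#!/usr/bin/env bash
# Sketch: same output from ( ) and { }, but different lasting effects.
# Directory names and the variable "marker" are made up for this example.
start_dir=$(pwd)
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
base=$(pwd)
mkdir elsewhere
unset marker

# Subshell: the cd and the assignment vanish when the parentheses close.
(cd elsewhere; marker=sub; pwd) > sub.out
after_subshell=$(pwd)
marker_after_subshell=${marker:-unset}

# Braces: the cd and the assignment happen in the current shell and persist.
{ cd elsewhere; marker=brace; pwd; } > brace.out
after_braces=$(pwd)
marker_after_braces=${marker:-unset}

sub_output=$(cat "$base/sub.out")
brace_output=$(cat "$base/brace.out")
echo "after ( ): $after_subshell  marker=$marker_after_subshell"
echo "after { }: $after_braces  marker=$marker_after_braces"

cd "$start_dir" || exit 1
rm -rf "$tmpdir"
```

Both output files hold the same text, but only the brace group leaves the shell in the elsewhere directory with marker set.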


One interesting thing you can do with braces is form more concise branching 
blocks (Recipe 6.2). You can shorten this: 


if [ $result = 1 ]; then 
    echo "Result is 1; excellent." 
    exit 0 
else 
    echo "Uh-oh, ummm, RUN AWAY! " 
    exit 120 
fi 


into this: 


[ $result = 1 ] \ 
&& { echo "Result is 1; excellent." ; exit 0; } \ 
|| { echo "Uh-oh, ummm, RUN AWAY! " ; exit 120; } 


How you write it depends on your style and what you think is readable, but 
we recommend the first form because it is clearer to a wider audience. 


See Also 

= Recipe 6.2, “Branching on Conditions” 

= Recipe 10.6, “Trapping Interrupts” 

= Recipe 15.11, “Getting Input from Another Machine” 

= Recipe 19.5, “Expecting to Change Exported Variables” 

= Recipe 19.8, “Forgetting that Pipelines Make Subshells” 

= “Builtin Shell Variables” in Appendix A to learn about $BASH_SUBSHELL 


2.15 Connecting Two Programs by Using Output 
as Input 


Problem 


You want to take the output from one program and use it as the input of 
another program. 


Solution 


You could redirect the output from the first program into a temporary file, 
then use that file as input to the second program. For example: 


$ cat one.file another.file > /tmp/cat.out 
$ sort < /tmp/cat.out 
$ rm /tmp/cat.out 
$ 


Or you could do all of that in one step, sending the output directly to the next 
program, by using the pipe symbol (|) to connect them. For example: 


cat one.file another.file | sort 


You can also link a sequence of several commands together by using multiple 
pipes: 


cat my* | tr 'a-z' 'A-Z' | sort | uniq | awk -f transform.awk | wc 


Discussion 


Using the pipe symbol means we don’t have to invent a temporary filename, 
remember it, and remember to delete it. 


Programs like sort can take input from standard input (redirected via the < 
symbol), but they can also take input as a filename. So, you can do this: 


sort /tmp/cat.out 


rather than redirecting the input into sort: 


sort < /tmp/cat.out 


That behavior (of using a filename if supplied, and if not, of using standard 
input) is a typical Unix/Linux characteristic, and a useful model to follow so 
that commands can be connected one to another via the pipe mechanism. 
Such programs are called filters, and if you write your programs and shell 
scripts that way, they will be more useful to you and to those with whom you 
share your work. 
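

A shell function can follow the same filter convention with very little code. This is a sketch under stated assumptions: the function name upcase and the file fruit.txt are hypothetical, and the trick relies on cat reading standard input when it is given no filenames:

```shell
#!/usr/bin/env bash
# Sketch: a tiny filter. With file arguments it reads those files;
# with none, cat falls through to standard input.
# The function name upcase and the file fruit.txt are hypothetical.
upcase() {
    cat "$@" | tr 'a-z' 'A-Z'
}

tmpdir=$(mktemp -d)
printf 'pear\napple\n' > "$tmpdir/fruit.txt"

# Used with a filename at the start of a pipeline ...
from_file=$(upcase "$tmpdir/fruit.txt" | sort)

# ... or in the middle of one, reading from the pipe.
from_pipe=$(sort "$tmpdir/fruit.txt" | upcase)

echo "$from_file"
echo "$from_pipe"
rm -rf "$tmpdir"
```

Either way the function fits into a pipeline, which is what makes a filter convenient to reuse.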


Feel free to be amazed at the powerful simplicity of the pipe mechanism. You 
can even think of the pipe as a rudimentary parallel processing mechanism. 
You have two commands (programs) running in parallel, sharing data—the 
output of one as the input to the next. They don’t have to run sequentially 
(where the first runs to completion before the second one starts); the second 
one can get started as soon as data is available from the first. 


Be aware, however, that commands run this way (i.e., connected by pipes) 
are run in separate processes. While such a subtlety can often be ignored, 
there are a few times when the implications of this are important. We’ll 
discuss that in Recipe 19.8. 


Also consider a command such as svn -v log | less. If less exits before 
Subversion has finished sending data, you’ll get an error like svn: Write 
error: Broken pipe. While it isn’t pretty, it also isn’t harmful. It happens 
all the time when you pipe a voluminous amount of data into a program like 
less—you often want to quit once you’ve found what you’re looking for, 
even if there is more data coming down the pipe. 


See Also 


= Recipe 3.1, “Getting Input from a File” 
= Recipe 19.8, “Forgetting that Pipelines Make Subshells” 


2.16 Saving a Copy of Output Even While Using 
It as Input 


Problem 


You want to debug a long sequence of piped I/O, such as: 


cat my* | tr 'a-z' 'A-Z' | uniq | awk -f transform.awk | wc 


How can you see what is happening between uniq and awk without disrupting 
the pipe? 


Solution 


The solution to these problems is to use what plumbers call a T-joint in the 
pipes. For bash, that means using the tee command to split the output into 
two identical streams, one that is written to a file and the other that is written 
to standard output, so as to continue the sending of data along the pipes. 


For this example where we’d like to debug a long string of pipes, we insert 
the tee command between uniq and awk: 


... uniq | tee /tmp/x.x | awk -f transform.awk ... 


Discussion 


The tee command writes the output to the filename(s) specified as its 
parameter and also writes that same output to standard out. In our example, it 
sends a copy to /tmp/x.x and also sends the same data to awk, the command to 
which the output of tee is connected via the pipe symbol. 


Don’t worry about what each different piece of the command line is doing in 
these examples; we just want to illustrate how tee can be used in any 
sequence of commands. 


Let’s back up just a bit and start with a simpler command line. Suppose you’d 
just like to save the output from a long-running command for later reference, 
while at the same time seeing it on the screen. After all, a command like: 


find / -name '*.c' -print | less 


could find a lot of C source files, so the output will likely scroll off the 
window. Using more or less will let you look at the output in manageable 
pieces, but once completed they don’t let you go back and look at that output 
without rerunning the command. Sure, you could run the command and save 
it to a file: 


find / -name '*.c' -print > /tmp/all.my.sources 


but then you have to wait for it to complete before you can see the contents of 
the file. (OK, we know about tail -f, but that’s just getting off-topic here.) 
The tee command can be used instead of the simple redirection of standard 
output: 


find / -name '*.c' -print | tee /tmp/all.my.sources 


In this example, since the output of tee isn’t redirected anywhere, it will print 
to the screen. But the copy that is diverted into a file will also be there for 
later use (e.g., cat /tmp/all.my.sources). 


Notice, too, that in these examples we did not redirect standard error at all. 
This means that any errors, like you might expect from find, will be printed to 
the screen but won’t show up in the tee file. We could add a 2>&1 to the find 
command: 


find / -name '*.c' -print 2>&1 | tee /tmp/all.my.sources 


to include the error output in the tee file. It won’t be neatly separated, but it 
will be captured. 
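

Both uses of tee can be tried on a small scale. The following sketch uses a short pipeline rather than the one from the Problem, and every file name in it is an assumption for the example:

```shell
#!/usr/bin/env bash
# Sketch: tee as a T-joint in a pipeline, and tee capturing errors too.
# All file names here are made up for this example.
tmpdir=$(mktemp -d)
printf 'b\na\nb\n' > "$tmpdir/in.txt"

# tee saves what sort produced while uniq still receives the same data.
count=$(sort "$tmpdir/in.txt" | tee "$tmpdir/after_sort.txt" | uniq | wc -l)
saved_lines=$(wc -l < "$tmpdir/after_sort.txt")

# Send errors down the pipe as well, so the tee file holds both streams.
ls "$tmpdir/in.txt" "$tmpdir/missing" 2>&1 | tee "$tmpdir/ls.log" > /dev/null
log_lines=$(wc -l < "$tmpdir/ls.log")

echo "unique lines: $count, lines saved mid-pipe: $saved_lines"
echo "lines in ls.log: $log_lines"
rm -rf "$tmpdir"
```

The saved copy shows the data as it was between sort and uniq, which is the kind of mid-pipeline snapshot that helps when debugging.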


See Also 


= man tee 
= Recipe 18.5, “Reusing Arguments” 


= Recipe 19.13, “Debugging Scripts” 


2.17 Connecting Two Programs by Using Output 
as Arguments 


Problem 


What if one of the programs to which you would like to connect with a pipe 
doesn’t work that way? For example, you can remove files with the rm 
command, specifying the files to be removed as parameters to the command: 


rm my.java your.c their.* 


But rm doesn’t read from standard input, so you can’t do something like: 


find . -name '*.c' | rm 


Since rm only takes its filenames as arguments or parameters on the 
command line, how can we get the output of a previously run command (e.g., 
echo or ls) onto the command line? 


Solution 


Use the command substitution feature of bash: 


rm $(find . -name '*.class') 


You can also use the xargs command; see the discussion in Recipe 15.13. 


Discussion 


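The $() encloses a command that is run in a subshell. The output from that 
command is substituted in place of the $() phrase. The shell then splits the 
substituted text into words at whitespace (spaces, tabs, and newlines, by 
default), so several lines of output become several parameters on the 
command line. That is often useful but may sometimes be surprising, for 
example when a filename contains a space. 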


The earlier shell syntax was to use backquotes (` `) instead of $() for 
enclosing the sub-command. The $() syntax is preferred over the older `` 
syntax because it is easier to nest and arguably easier to read. However, you 
may see `` more often than $(), especially in older scripts or from those who 
grew up with the original Bourne or C shells. 


In our example, the output from find—typically a list of names—will become 
the arguments to the rm command. 
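

To see how the substituted output turns into arguments, consider this sketch. The helper count_args and all of the file names are hypothetical, and it works in a scratch directory so nothing of value is removed:

```shell
#!/usr/bin/env bash
# Sketch: how the output of $() becomes separate arguments.
# The helper count_args and all file names are hypothetical.
tmpdir=$(mktemp -d)
touch "$tmpdir/First.class" "$tmpdir/Other.class" "$tmpdir/Keep.java"

# Report how many arguments were received.
count_args() { echo "$#"; }

# Two lines of find output become two arguments.
n=$(count_args $(find "$tmpdir" -name '*.class'))

rm $(find "$tmpdir" -name '*.class')
remaining=$(ls "$tmpdir")

# The surprising case: one file whose name contains a space is split in two.
touch "$tmpdir/two words.class"
surprise=$(count_args $(find "$tmpdir" -name '*.class'))

echo "arguments from two files: $n"
echo "left after rm: $remaining"
echo "arguments from one file with a space: $surprise"
rm -rf "$tmpdir"
```

The last count shows why names containing whitespace need extra care with this technique.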


WARNING 
Be very careful when doing something like this because rm is very unforgiving. 
If your find command finds more than you expect, rm will remove it with no 
recourse. This is not Windows; you cannot recover deleted files from the 
recycle bin. You can mitigate the danger with rm -i, which will prompt you to 
verify each deletion. That’s OK on a small number of files, but interminable on 
a large set. 


One way to use such a mechanism in bash with greater safety is to run that 
inner command first by itself. When you can see that you are getting the results 
that you want, only then do you use it in the command with $(). 


For example: 


$ find . -name '*.class' 
First.class 
Other.class 
$ rm $(find . -name '*.class') 
$ 


We’ll see in an upcoming recipe how this can be made even more foolproof by 
using !! instead of retyping the find command (see Recipe 18.2). 


See Also 


= Recipe 9.8, “Finding Files by Size” 
= Recipe 18.2, “Repeating the Last Command” 
= Recipe 15.13, “Working Around “Argument list too long” Errors” 


2.18 Using Multiple Redirects on One Line 


Problem 


You want to redirect output to several different places. 


Solution 


Use redirection with file numbers to open all the files that you want to use. 
For example: 


divert 3> file.three 4> file.four 5> file.five 6> else.where 


where divert might be a shell script with various commands whose output 
you want to send to different places. For example, you might write divert to 
contain lines like this: echo option $OPTSTR >&5. That is, your divert shell 
script could direct its output to various different descriptors, which the 
invoking program can send to different destinations. 


Similarly, if divert was a C program executable, you could actually write to 
descriptors 3, 4, 5, and 6 without any need for open() calls. 


Discussion 


In Recipe 2.8 we explained that each file descriptor is indicated by a number, 
starting at zero: standard input is 0, standard output is 1, and standard error is 
2. If no number is given, 1 is assumed. That means that you could redirect 
standard output with the slightly more verbose 1> (rather than a simple >) 
followed by a filename, but there’s no need; the shorthand > is fine. It also 
means that you can have the shell open up any number of arbitrary file 
descriptors and have them set to write various files so that the program that 
the shell then invokes from the command line can use these opened file 
descriptors without further ado. 


While we don’t recommend this technique because it’s fragile and more 
complicated than it needs to be, it is intriguing. 
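

For the curious, here is a sketch of the idea with divert written as a shell function rather than a separate script. The messages, descriptor choices, and file names are invented for illustration:

```shell
#!/usr/bin/env bash
# Sketch: the caller opens extra descriptors; the callee just writes to them.
# divert stands in for the script described above; the messages and
# file names are made up for this example.
tmpdir=$(mktemp -d)

divert() {
    echo "to three" >&3
    echo "to four"  >&4
    echo "normal output"
}

# The shell opens descriptors 3 and 4 (and redirects 1) before divert runs.
divert 3> "$tmpdir/file.three" 4> "$tmpdir/file.four" > "$tmpdir/file.out"

three=$(cat "$tmpdir/file.three")
four=$(cat "$tmpdir/file.four")
normal=$(cat "$tmpdir/file.out")
echo "$three / $four / $normal"
rm -rf "$tmpdir"
```

Note the fragility: if divert were run without the 3> and 4> redirections, its writes to those descriptors would fail.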


See Also 


= Recipe 2.6, “Saving Output to Other Files” 
= Recipe 2.8, “Sending Output and Error Messages to Different Files” 
= Recipe 2.13, “Throwing Output Away” 


2.19 Saving Output When Redirect Doesn’t Seem 
to Work 


Problem 


You tried using > but some (or all) of the output still appears on the screen. 
For example, the compiler was producing these error messages: 


$ gcc bad.c 
bad.c: In function `main': 
bad.c:3: error: `bad' undeclared (first use in this function) 
bad.c:3: error: (Each undeclared identifier is reported only once 
bad.c:3: error: for each function it appears in.) 
bad.c:3: error: parse error before "c" 
$ 


You wanted to capture those messages, so you tried redirecting the output: 


$ gcc bad.c > save.it 
bad.c: In function `main': 
bad.c:3: error: `bad' undeclared (first use in this function) 
bad.c:3: error: (Each undeclared identifier is reported only once 
bad.c:3: error: for each function it appears in.) 
bad.c:3: error: parse error before "c" 
$ 


However, it doesn’t seem to have redirected anything. In fact, when you 
examine the file into which you were directing the output, that file is empty 
(zero bytes long): 


$ ls -l save.it 
-rw-r--r-- 1 albing users 0 2005-11-13 15:30 save.it 
$ cat save.it 
$ 


Solution 


Redirect the error output, as follows: 


gcc bad.c 2> save.it 


The contents of save.it are now the error messages that you saw before. 


Discussion 


So what’s going on here? Every process in Unix and Linux typically starts 
out with three open file descriptors: one for input called standard input 
(STDIN), one for output called standard output (STDOUT), and one for error 
messages called standard error (STDERR). It is really up to the programmer 
who writes any particular program to stick to these conventions and write 
error messages to standard error and the normally expected output to 
standard out, so there is no guarantee that every error message that you ever 
get will go to standard error. But most of the long-established utilities are 
well behaved this way. That is why these compiler messages are not being 
diverted with a simple > redirect; it only redirects standard output, not 
standard error. 


As mentioned in the previous recipe, each file descriptor is indicated by a 
number, starting at zero. Standard input is 0, output is 1, and error is 2. That 
means that you could redirect standard output with the slightly more verbose: 
1> (rather than a simple >) followed by a filename, but there’s no need. The 
shorthand > is fine. To redirect standard error, use 2>. 


One important difference between standard output and standard error is that 
standard output is buffered but standard error is unbuffered; that is, every 
character is written individually, and they aren’t collected together and 
written as a bunch. This means that you see the error messages right away 
and that there is less chance of them being dropped when a fault occurs, but 
the cost is one of efficiency. It’s not that standard output is unreliable, but in 
error situations (e.g., when a program dies unexpectedly), the buffered output 
may not have made it to the screen before the program stops executing. 
That’s why standard error is unbuffered: to be sure the message gets written. 
By contrast, with standard output, only when the buffer is full (or when the 
file is closed) does the output actually get written. It’s more efficient for the 
more frequently used output, but efficiency isn’t as important when an error 
is being reported. 


What if you want to see the output as you are saving it? The tee command we 
discussed in Recipe 2.16 seems just the thing: 


gcc bad.c 2>&1 | tee save.it 


This will take standard error and redirect it to standard out, piping them both 
into tee. The tee command will write its input to both the file (save.it) and 
tee’s standard out, which will go to your screen since it isn’t otherwise 
redirected. 


This is a special case of redirecting because normally the order of the 
redirections is important. Compare these two commands: 


somecmd >my.file 2>&1 
somecmd 2>&1 >my.file 


In the first case, standard output is redirected to a file (my.file), and then 
standard error is redirected to the same place as standard out. All output will 
appear in my.file. 


But that is not the case with the second command. In the second command, 
standard error is redirected to standard output (which at that point is 
connected to the screen), after which standard output is redirected to my.file. 
Thus, only standard output messages will be put in the file, and errors will 
still show on the screen. 
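

The effect of the ordering can be checked with a small experiment. In this sketch somecmd is a stand-in function that writes one line to each stream, and a command substitution plays the role of the screen so that whatever leaks can be captured; the file names are assumptions:

```shell
#!/usr/bin/env bash
# Sketch: the order of redirections decides where STDERR ends up.
# somecmd is a stand-in that writes one line to each stream.
tmpdir=$(mktemp -d)

somecmd() {
    echo "to stdout"
    echo "to stderr" >&2
}

# STDOUT to the file first, then STDERR to the same place: both captured.
somecmd > "$tmpdir/both.out" 2>&1
both=$(cat "$tmpdir/both.out")

# STDERR is copied to the current STDOUT first (here, the command
# substitution, standing in for the screen); only then does STDOUT
# move to the file.
leaked=$(somecmd 2>&1 > "$tmpdir/only.out")
only=$(cat "$tmpdir/only.out")

echo "first form, in file:   $both"
echo "second form, in file:  $only"
echo "second form, leaked:   $leaked"
rm -rf "$tmpdir"
```

In the second form the error line never reaches the file; it goes wherever standard output pointed at the moment 2>&1 was processed.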


However, this ordering had to be subverted for pipes—you can’t put the 
second redirect after the pipe symbol, because after the pipe comes the next 
command. So, bash makes an exception when you write: 


somecmd 2>&1 | othercmd 


and recognizes that standard output is being piped. It therefore assumes that 
you want to include standard error in the piping when you write 2>&1 even 
though its normal ordering wouldn’t work that way. 


The other result of this, and of pipe syntax in general, is that it gives us no 
way to pipe just standard error and not standard output into another command 
—unless we first swap the file descriptors (see the next recipe). 


NOTE 
As of the 4.x versions of bash, there is a shortcut syntax for redirecting both 
standard output and standard error into a pipe. To redirect both output streams 
from somecmd into some othercmd, as shown previously, we can now use |& to 
write: 


somecmd |& othercmd 


See Also 


= Recipe 2.17, “Connecting Two Programs by Using Output as Arguments” 
= Recipe 2.20, “Swapping STDERR and STDOUT” 


2.20 Swapping STDERR and STDOUT 


Problem 


You need to swap STDERR and STDOUT so you can send STDOUT to a 
logfile, but then send STDERR to the screen and to a file using the tee 
command. But pipes only work with STDOUT. 


Solution 


Swap STDERR and STDOUT before the pipe redirection using a third file 
descriptor: 


./myscript 3>&1 1>stdout.logfile 2>&3- | tee -a stderr.logfile 


Discussion 


Whenever you redirect file descriptors, you are duplicating the open 
descriptor to another descriptor. This gives you a way to swap descriptors, 
much like how any program swaps two values—by means of a third, 
temporary holder. Copy A into C, copy B into A, copy C into B, and then you 
have swapped the values of A and B. For file descriptors, it looks like this: 


./myscript 3>&1 1>&2 2>&3 


Read the syntax 3>&1 as “give file descriptor 3 the same value as output file 
descriptor 1.” What happens here is that it duplicates file descriptor 1 (i.e., 
STDOUT) into file descriptor 3, our temporary holding place. Then it 
duplicates file descriptor 2 (i.e., STDERR) into STDOUT, and finally 
duplicates file descriptor 3 into STDERR. The net effect is that the STDERR 
and STDOUT file descriptors have swapped places. 


So far, so good. Now we just change this slightly. Once we’ve copied 
STDOUT (into file descriptor 3), we are free to redirect STDOUT into the 
logfile we want to have capture the output of our script or other program. 
Then we can copy the file descriptor from its temporary holding place (file 
descriptor 3) into STDERR. Adding the pipe will now work because the pipe 
connects to the (original) STDOUT. That gets us to the solution shown 
earlier: 


./myscript 3>&1 1>stdout.logfile 2>&3- | tee -a stderr.logfile 


Note the trailing - on the 2>&3- term. We do that so that we close file 
descriptor 3 when we are done with it. That way our program doesn’t have an 
extra open file descriptor. We are tidying up after ourselves. 


We’re also using the -a option to tee to append instead of replace. 
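

The whole arrangement can be exercised with a stand-in for the script. This is only a sketch: myscript here is a hypothetical function that writes one line to each stream, and the logfiles live in a scratch directory:

```shell
#!/usr/bin/env bash
# Sketch: STDOUT goes to a logfile while STDERR travels down the pipe to tee.
# myscript is a hypothetical stand-in for ./myscript.
tmpdir=$(mktemp -d)

myscript() {
    echo "normal output"
    echo "an error" >&2
}

# 3>&1  saves the pipe (the current STDOUT) in descriptor 3,
# 1>... sends STDOUT to its logfile,
# 2>&3- moves descriptor 3 onto STDERR and closes 3.
shown=$(myscript 3>&1 1> "$tmpdir/stdout.logfile" 2>&3- \
        | tee -a "$tmpdir/stderr.logfile")

out_log=$(cat "$tmpdir/stdout.logfile")
err_log=$(cat "$tmpdir/stderr.logfile")
echo "seen on screen: $shown"
echo "stdout.logfile: $out_log"
echo "stderr.logfile: $err_log"
rm -rf "$tmpdir"
```

Only the error line passes through tee, so it is the only thing that is both displayed and saved in stderr.logfile.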


See Also 


= Linux Server Hacks, by Rob Flickenger (O’Reilly), Hack #5, “n>&m: 
Swap STDOUT and STDERR” 


= Recipe 2.19, “Saving Output When Redirect Doesn’t Seem to Work” 


= Recipe 10.1, ““Daemon-izing” Your Script” 


2.21 Keeping Files Safe from Accidental 
Overwriting 


Problem 


You don’t want to delete the contents of a file by mistake. It can be too easy 
to mistype a filename and find that you’ve redirected output into a file that 
you meant to save. 


Solution 


Tell the shell to be more careful, as follows: 


set -o noclobber 


If you decide you don’t want to be so careful after all, then turn the option 
off: 


set +o noclobber 


Discussion 


The noclobber option tells bash not to overwrite any existing files when you 
redirect output. If the file to which you redirect output doesn’t (yet) exist, 
everything works as normal, with bash creating the file as it opens it for 
output. If the file already exists, however, you will get an error message. 


Here it is in action. We begin by turning the option off, just so that your shell 
is in a known state, regardless of how your particular system may be 
configured: 


$ set +o noclobber 
$ echo something > my.file       # (1) 
$ echo some more > my.file       # (2) 
$ set -o noclobber               # (3) 
$ echo something > my.file 
bash: my.file: cannot overwrite existing file 
$ echo some more >> my.file      # (4) 
$ 


(1) The first time we redirect output to my.file the shell will create it for us. 

(2) The second time we redirect, bash overwrites the file (it truncates the file 
to 0 bytes and starts writing from there). 

(3) Then we set the noclobber option and we get an error message when we 
try to write to that file. 

(4) As we show in the last part of this example, we can append to the file 
(using >>) just fine. 


WARNING 


Beware! The noclobber option only refers to the shell’s clobbering of a file 
when redirecting output. It will not stop other file manipulating actions of other 
programs from clobbering files (see Recipe 14.13): 


$ echo useless data > some.file 

$ echo important data > other.file 
$ set -o noclobber 

$ cp some.file other.file 


$ 


Notice that no error occurs; the file is copied over the top of an existing file. 
That copy is done via the cp command. The shell doesn’t get involved. 


If you’re a good and careful typist this may not seem like an important 
option, but we will look at other recipes where filenames are generated with 
regular expressions or passed as variables. Those filenames could be used as 
the filename for output redirection. In such cases, having noclobber set may 
be an important safety feature for preventing unwanted side effects (whether 
goofs or malicious actions). 
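

In a script, the failed redirect shows up as a nonzero exit status that you can test. The sketch below assumes a file name held in a variable (report.txt is made up) and runs the risky redirect in a subshell so the error message can be silenced:

```shell
#!/usr/bin/env bash
# Sketch: noclobber guarding a file whose name comes from a variable.
# The name report.txt is made up; imagine it arrived as an argument.
tmpdir=$(mktemp -d)
report="$tmpdir/report.txt"
echo "important data" > "$report"

set -o noclobber
# The redirect fails, so echo never writes and the status is nonzero.
if ( echo "new data" > "$report" ) 2> /dev/null; then
    result=overwritten
else
    result=protected
fi
set +o noclobber

content=$(cat "$report")
echo "result: $result, file still holds: $content"
rm -rf "$tmpdir"
```

Turning the option off again afterward keeps the rest of the script behaving as it did before.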


See Also 


= A good Linux reference on the chmod command and file permissions, 
such as: 
- http://www.linuxforums.org/security/file_permissions.html 
- http://www.comptechdoc.org/os/linux/usersguide/linux_ugfilesp.html 
- http://www.faqs.org/docs/linux_intro/sect_03_04.html 
- http://www.perlfect.com/articles/chmod.shtml 


= Recipe 2.22, “Clobbering a File on Purpose” 


= Recipe 14.13, “Setting Permissions” 


2.22 Clobbering a File on Purpose 


Problem 


You like to have noclobber set, but every once in a while you do want to 
clobber a file when you redirect output. Can you override bash’s good 
intentions, just once? 


Solution 


Use >| to redirect your output. Even if noclobber is set, bash ignores its 
setting and overwrites the file. 


Consider this example: 


$ echo something > my.file 
$ set -o noclobber 
$ echo some more >| my.file      # (1) 
$ cat my.file 
some more 
$ echo once again > my.file      # (2) 
bash: my.file: cannot overwrite existing file 
$ 


(1) Notice that no error message occurs on the second echo. 

(2) But on the third echo, when we are no longer using the vertical bar but 
just the plain > character by itself, the shell warns us and does not clobber 
the existing file. 


Discussion 


Using noclobber does not take the place of file permissions. If you don’t 
have write permission in the directory, you won’t be able to create the file, 
whether or not you use the >| construct. Similarly, you must have write 
permission on the file itself to overwrite that existing file, whether or not you 
use the >|. 


So why the vertical bar? According to Chet, “POSIX specifies the >| syntax, 
which it picked up from ksh88. I’m not sure why Korn chose it. csh does use 
>!.” To help you remember it, you can think of it as an exclamation point for 
emphasis. Its use in English (with the imperative mood) fits that sense of “do 
it anyway!” when telling bash to overwrite the file if need be. The vi and ex 
editors use the ! with that same meaning in their write (:w! filename) 
command. Without a !, the editor will complain if you try to overwrite an 
existing file. With it, you are telling the editor to “do it!” 


See Also 
= Recipe 2.21, “Keeping Files Safe from Accidental Overwriting” 


= Recipe 14.13, “Setting Permissions” 


Chapter 3. Standard Input 


Whether it is data for a program to crunch, or simple commands to direct the 
behavior of a script, input is as fundamental as output. The first part of any 
program is the beginning of the “input/output” yin and yang of computing. 


3.1 Getting Input from a File 


Problem 


You want your shell commands to read data from a file. 


Solution 


Use input redirection, indicated by the < character, to read data from a file: 


wc < my.file 


Discussion 


Just as the > sends output to a file, so the < takes input from a file. The choice 
and shape of the characters were meant to give a visual clue as to what was 
going on with redirection. Can you see it? (Think “arrowhead.”) 


Many shell commands will take one or more filenames as arguments, but 
when no filename is given will read from standard input. Those commands 
can be invoked as either command filename or command < filename with 
the same result. That’s the case here with wc, but also with cat and others. 
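

With wc the two forms count the same data, though there is one small visible difference worth knowing about. Here is a sketch, assuming a three-line my.file created in a scratch directory:

```shell
#!/usr/bin/env bash
# Sketch: a filename argument versus redirected standard input.
# The three-line my.file is created here just for the example.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/my.file"

# Given a filename, wc labels its count with that name.
by_name=$(wc -l "$tmpdir/my.file")

# Reading standard input, wc has no name to show, only the count.
by_stdin=$(wc -l < "$tmpdir/my.file")

echo "by name:  $by_name"
echo "by stdin: $by_stdin"
rm -rf "$tmpdir"
```

The redirected form is handy in scripts when you want just the number, without having to strip off the filename.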


It may look like a simple feature, and be familiar if you’ve used the DOS 
command line before, but it is a significant feature of shell scripting (which 
the DOS command line borrowed) and was radical in both its power and its 
simplicity when first introduced. 


See Also 


= Recipe 2.6, “Saving Output to Other Files” 


3.2 Keeping Your Data with Your Script 


Problem 


You need input to your script, but don’t want a separate file. 


Solution 


Use a here-document with the << characters, redirecting the text from the 
command line rather than from a file. When put into a shell script, the script 
file then contains the data along with the script. 


Here’s an example of a shell script in a file we call ext: 


$ cat ext 
# 
# here is a "here" document 
# 
grep $1 <<EOF 
mike x.123 
joe x.234 
sue x.555 
pete x.818 
sara x.822 
bill x.919 
EOF 
$ 


It can be used as a shell script for simple phone number lookups: 


$ ext bill 
bill x.919 
$ 


or: 


$ ext 555 
sue x.555 
$ 


Discussion 


The grep command looks for occurrences of the first argument in the files 
that are named, or if no files are named it looks to standard input. 


A typical use of grep is something like this: 


grep somestring file.txt 


or: 


grep myvar *.c 


In our ext script we’ve parameterized the grep by making the string that 
we’re searching for be the parameter of our shell script ($1). Whereas we 
often think of grep as searching for a fixed string through various different 
files, here we are going to vary what we search for, but search through the 
same data every time. 


We could have put our phone numbers in a file, say phonenumbers.txt, and 
then used that filename on the line that invokes the grep command: 


grep $1 phonenumbers.txt


However, that requires two separate files (our script and our datafile) and
raises the question of where to put them and how to keep them together.


So, rather than supplying one or more filenames to search through, we set up 
a here-document and tell the shell to redirect standard input to come from that 
(temporary) document. 


The << syntax says that we want to create such a temporary input source, and 
the EOF is just an arbitrary string (you can choose what you like) to act as the
terminator of the temporary input. It is not part of the input, but acts as the 
marker to show where it ends. The regular shell script (if any) resumes after 
the marker. 


We also might add -i to the grep command to make our search
case-insensitive. Thus, using grep -i $1 <<EOF would allow us to search for
“Bill” as well as “bill”.
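

To make the case-insensitive variant concrete, here is a minimal sketch that wraps the same lookup in a function. The function name lookup_ext is our own invention for illustration, and we quote "$1" so that a search string containing spaces stays in one piece.

```shell
# Hypothetical function wrapping the same lookup; -i ignores case and
# quoting "$1" keeps a multiword argument together.
lookup_ext() {
    grep -i "$1" <<EOF
mike x.123
joe x.234
sue x.555
pete x.818
sara x.822
bill x.919
EOF
}

lookup_ext Bill
lookup_ext SUE
```

Because the data contains no $ characters or backquotes, leaving the EOF marker unquoted is harmless here.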


See Also 
• man grep
• Recipe 3.3, “Preventing Weird Behavior in a Here-Document”
• Recipe 3.4, “Indenting Here-Documents”


3.3 Preventing Weird Behavior in a Here-Document


Problem 


Your here-document is behaving weirdly. You wanted to maintain a simple 
list of donors using the method described previously for phone numbers, so 
you created a file called donors that looked like this: 


$ cat donors
#
# simple lookup of our generous donors
#
grep $1 <<EOF
# name amt
pete $100
joe $200
sam $ 25
bill $ 9
EOF
$


But when you tried running it you got weird output:


$ ./donors bill
pete bill00
bill $ 9
$ ./donors pete
pete pete00
$


Solution 


Turn off the shell scripting features inside the here-document by escaping any 
or all of the characters in the ending marker: 


grep $1 <<'EOF'
pete $100
joe $200
sam $ 25
bill $ 9
EOF


Discussion 


It’s a very subtle difference, but the <<EOF can be replaced with <<\EOF, or
<<'EOF', or even <<E\OF; they all work. It’s not the most elegant syntax,
but it’s enough to tell bash that you want to treat the “here” data differently.


Normally (i.e., unless you use this escaping syntax), says the bash manpage, 
“all lines of the here-document are subjected to parameter expansion, 
command substitution, and arithmetic expansion.” 

So what’s happening in our original donors script is that the amounts are 
being interpreted as shell variables. For example, $100 is being seen as the 
shell variable $1 followed by two zeros. That’s what gives us pete00 when 
we search for “pete” and bill00 when we search for “bill”.


When we escape some or all of the characters of the EOF, bash knows not to 
do the expansions, and the behavior is the expected behavior: 


$ ./donors pete
pete $100 


Of course, you may want the shell expansion on your data—it can be useful 
in the correct circumstances—but that isn’t what we want here. We’ve found 
it to be a useful practice to always escape the marker, as in <<'EOF' or 
<<\EOF, to avoid unexpected results, unless you know that you really want 
the expansion to be done on your data. 
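

The difference is easy to see side by side. In this sketch of ours, set -- pete simply stands in for calling a script with “pete” as its first argument, and the variable names expanded and literal are chosen only for illustration.

```shell
set -- pete    # pretend the script was called with "pete" as $1

# Unquoted marker: $100 is expanded as $1 followed by "00".
expanded=$(cat <<EOF
pete $100
EOF
)

# Quoted marker: the text is passed through untouched.
literal=$(cat <<'EOF'
pete $100
EOF
)

echo "unquoted: $expanded"
echo "quoted:   $literal"
```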


WARNING 
Trailing whitespace (even just a single blank space) on your closing EOF marker 
will cause it not to be recognized as the closing marker. bash will swallow up 
the rest of your script, treating it as input too and looking for that EOF. Be sure 
there are no extra characters (especially spaces or tabs) after the EOF. 


See Also 
• Recipe 3.2, “Keeping Your Data with Your Script”
• Recipe 3.4, “Indenting Here-Documents”


3.4 Indenting Here-Documents 


Problem 


The here-document is great, but it’s messing up your shell script’s formatting. 
You want to be able to indent for readability. 


Solution 


Use <<-, and then you can use tab characters (only!) at the beginning of lines 
to indent this portion of your shell script: 


$ cat myscript.sh
	grep $1 <<-'EOF'
		lots of data
		can go here
		it's indented with tabs
		to match the script's indenting
		but the leading tabs are
		discarded when read
		EOF
	ls


Discussion 


The hyphen (-) just after the << is enough to tell bash to ignore the leading 
tab characters. This is for tab characters only and not arbitrary whitespace. 
Note that this is especially important with the EOF or any other marker 
designation. If you have spaces there, it will not recognize the EOF as your 
ending marker, and the “here” data will continue through to the end of the file 
(swallowing the rest of your script). Therefore, you may want to always left- 
justify the EOF (or other marker) just to be safe, and let the formatting go on 
this one line. 
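

Since tabs are invisible on the page, it can help to generate a test script in which they are spelled out. The sketch below is an illustration under our own assumptions: it uses printf to write a throwaway script (the file name comes from mktemp) whose here-document lines and closing marker are indented with real tab characters, then runs it with bash.

```shell
# Build a tiny script with printf so the tab characters (\t) are explicit
# and cannot be silently turned into spaces by an editor.
demo=$(mktemp)
printf 'cat <<-EOF\n\tone tab\n\t\ttwo tabs\n\tEOF\necho "script continues"\n' > "$demo"

# Run it: every leading tab is stripped, and the tab-indented EOF
# is still recognized as the closing marker.
out=$(bash "$demo")
rm -f "$demo"
printf '%s\n' "$out"
```

If you change any of the \t sequences in front of EOF to spaces, the marker is no longer recognized and the echo line becomes part of the data instead of running.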


WARNING 


Just as trailing whitespace of any kind on your closing EOF delimiter prevents it 
from being recognized as the closing delimiter (see the warning in Recipe 3.3), 
so too will using a leading character other than just the tab character. If your 
script indents with spaces or a combination of spaces and tabs, don’t use that 
technique on here-documents. Either use just tabs, or keep it all flush left. Also, 
watch out for text editors that automatically replace tabs with spaces. 


See Also 
• Recipe 3.2, “Keeping Your Data with Your Script”
• Recipe 3.3, “Preventing Weird Behavior in a Here-Document”


3.5 Getting User Input 


Problem 


You need to get input from the user.


Solution


Use the read statement:


read

or:

read -p "answer me this " ANSWER

or:

read -t 3 -p "answer quickly: " ANSWER

or:

read PRE MID POST


Discussion 


In its simplest form, a read statement with no arguments will read user input
and place it into the shell variable REPLY.


If you want bash to print a prompt string before reading the input, use the -p
option. The next word following the -p will be the prompt, but quoting
allows you to supply multiple words for a prompt. Remember to end the
prompt with punctuation and/or a space, as the cursor will wait for input right
at the end of the prompt string.


The -t option sets a timeout. The read statement will return after the 
specified number of seconds regardless of whether the user has responded. 
Our example uses both the -t and -p options together, but you can use the -t 
option on its own. As of bash version 4 you can even specify fractional 
numbers of seconds, like .25 or 3.5 for the timeout value. The exit status 
($?) will be greater than 128 if the read timed out. 


If you supply multiple variable names in the read statement, then read 
parses the input into words, assigning them in order. If the user enters fewer 
words, the extra variables will be set to null. If the user enters more words 
than there are variables in the read statement, then all of the extra words will 
be part of the last variable in the list. 
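

The word-splitting rules are easiest to see with canned input. In this sketch a here-document stands in for what the user would type; the variable names FIRST, SECOND, and THIRD are ours, chosen only for the illustration.

```shell
# Five words, three variables: the last variable collects the leftovers.
read PRE MID POST <<EOF
one two three four five
EOF
echo "PRE=$PRE MID=$MID POST=$POST"

# One word, three variables: the unused variables end up empty.
read FIRST SECOND THIRD <<EOF
solo
EOF
echo "FIRST=$FIRST SECOND='$SECOND' THIRD='$THIRD'"
```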


See Also 


• help read for more options to the read builtin
• Recipe 3.8, “Prompting for a Password”
• Recipe 6.11, “Looping with a read”
• Recipe 13.6, “Parsing Text with a read Statement”
• Recipe 14.12, “Validating Input”


3.6 Getting Yes or No Input 


Problem 


You need to get a simple yes or no input from the user, and you want to be as 
user-friendly as possible. In particular, you do not want to be case-sensitive, 
and you want to provide a useful default if the user presses the Enter key. 


Solution 


If the actions to take are simple, use the self-contained function in
Example 3-1. 


Example 3-1. ch03/func_choose


# cookbook filename: func_choose

# Let the user make a choice about something and execute code based on
# the answer
# Called like: choose <default (y or n)> <prompt> <yes action> <no action>
# e.g. choose "y" \
#       "Do you want to play a game?" \
#       /usr/games/GlobalThermonuclearWar \
#       'printf "%b" "See you later Professor Falkin.\n"' >&2
# Returns: nothing
function choose {

    local default="$1"
    local prompt="$2"
    local choice_yes="$3"
    local choice_no="$4"
    local answer

    read -p "$prompt" answer
    [ -z "$answer" ] && answer="$default"

    case "$answer" in
        [yY1] ) eval "$choice_yes"
            # error check
            ;;
        [nN0] ) eval "$choice_no"
            # error check
            ;;
        *     ) printf "%b" "Unexpected answer '$answer'!" >&2 ;;
    esac
} # end of function choose


If the actions are complicated, use the function in Example 3-2 and handle the 
results in your main code. 


Example 3-2. ch03/func_choice.1


# cookbook filename: func_choice.1

# Let the user make a choice about something and return a standardized
# answer. How the default is handled and what happens next is up to
# the if/then after the choice in main.
# Called like: choice <prompt>
# e.g. choice "Do you want to play a game?"
# Returns: global variable CHOICE
function choice {

    CHOICE=''
    local prompt="$*"
    local answer

    read -p "$prompt" answer
    case "$answer" in
        [yY1] ) CHOICE='y';;
        [nN0] ) CHOICE='n';;
        *     ) CHOICE="$answer";;
    esac
} # end of function choice


NOTE


If we returned “0” for no and “1” for yes, that would lend itself to interesting
uses in if choice ...; then expressions. We will leave that as an exercise
for the reader.


The code in Example 3-3 calls the choice function to prompt for and verify a
package date. Assuming $THISPACKAGE is set, the function displays the date
and asks for verification. If the user types y, Y, or presses Enter, then that date
is accepted. If the user enters a new date, the function loops and verifies it
(for a different treatment of this problem, see Recipe 11.7).


Example 3-3. ch03/func_choice.2


# cookbook filename: func_choice.2
CHOICE=''
until [ "$CHOICE" = "y" ]; do
    printf "%b" "This package's date is $THISPACKAGE\n" >&2
    choice "Is that correct? [Y/,<New date>]: "
    if [ -z "$CHOICE" ]; then
        CHOICE='y'
    elif [ "$CHOICE" != "y" ]; then
        printf "%b" "Overriding $THISPACKAGE with $CHOICE\n"
        THISPACKAGE=$CHOICE
    fi
done


# Build the package here 


Next we’ll show different ways to handle some yes or no questions. Carefully 
read the prompts and look at the defaults. In both cases the user can simply 
hit the Enter key, and the script will then take the default the programmer 
intended: 


# If the user types anything except a case-insensitive 'n', they will
# see the error log
choice "Do you want to look at the error logfile? [Y/n]: "
if [ "$CHOICE" != "n" ]; then
    less error.log
fi

# If the user types anything except a case-insensitive 'y', they will
# not see the message log
choice "Do you want to look at the message logfile? [y/N]: "
if [ "$CHOICE" = "y" ]; then
    less message.log
fi


Finally, the function in Example 3-4 asks for input that might not exist.


Example 3-4. ch03/func_choice.3


# cookbook filename: func_choice.3
choice "Enter your favorite color, if you have one: "
if [ -n "$CHOICE" ]; then
    printf "%b" "You chose: $CHOICE\n"
else
    printf "%b" "You do not have a favorite color.\n"
fi


Discussion


Asking the user to make a decision is often necessary in scripting. For getting
arbitrary input, see Recipe 3.5. For choosing an option from a list, see Recipe
3.7.


If the possible choices and the code to handle them are fairly straightforward, 
the first self-contained function is easier to use, but it’s not always flexible 
enough. The second function is flexible at the expense of having to do more 
in the main code. 


Note that we’ve sent the user prompts to STDERR so that the main script 
output on STDOUT may be redirected without the prompts cluttering it up. 


See Also 


• Recipe 3.5, “Getting User Input”
• Recipe 3.7, “Selecting from a List of Options”
• Recipe 11.7, “Figuring Out Date and Time Arithmetic”


3.7 Selecting from a List of Options 


Problem 


You need to provide the user with a list of options to choose from and you 
don’t want to make them type any more than necessary. 


Solution 


Use bash’s builtin select construct to generate a menu, then have the user 
choose by typing the number of the selection (see Example 3-5). 


Example 3-5. ch03/select_dir 


# cookbook filename: select_dir

directorylist="Finished $(for i in /*;do [ -d "$i" ] && echo $i; done)"

PS3='Directory to process? ' # Set a useful select prompt
until [ "$directory" == "Finished" ]; do

    printf "%b" "\a\n\nSelect a directory to process:\n" >&2
    select directory in $directorylist; do

        # User types a number which is stored in $REPLY, but select
        # returns the value of the entry
        if [ "$directory" == "Finished" ]; then
            echo "Finished processing directories."
            break
        elif [ -n "$directory" ]; then
            echo "You chose number $REPLY, processing directory..."
            # Do something here
            break
        else
            echo "Invalid selection!"
        fi # end of handle user's selection

    done # end of select a directory
done # end of until dir == finished


Discussion 


The select statement makes it trivial to present a numbered list to the user
on STDERR, from which they may make a choice. Don’t forget to provide an
“exit” or “finished” choice, though Ctrl-D will end the select and empty
input will print the menu again.


The number the user typed is returned in $REPLY, and the value of that entry 
is returned in the variable you specified in the select construct. 


See Also 


• help select
• help read
• Recipe 3.6, “Getting Yes or No Input”


3.8 Prompting for a Password


Problem 


You need to prompt the user for a password, but you don’t want it echoed on 
the screen. 


Solution 


Use the read command to read the user’s input, but with a special option to 
turn off echoing: 


read -s -p "password: " PASSWD
printf "%b" "\n"


Discussion 


The -s option tells the read command not to echo the characters typed (s is 
for silent) and the -p option says that the next argument is the prompt to be 
displayed prior to reading input. 

The line of input that is read from the user is put into the variable named
$PASSWD.


We follow the read with a printf to print out a newline. The printf is 
necessary because read -s turns off the echoing of characters. With echoing 
disabled, when the user presses the Enter key no newline is echoed and any 
subsequent output would appear on the same line as the prompt. Printing the 
newline gets us to the next line, as you would expect. It may even be handy 
for you to write the code all on one line to avoid intervening logic (putting it 
on one line also prevents mistakes should you cut and paste this line 
elsewhere): 


read -s -p "password: " PASSWD ; printf "%b" "\n" 


Be aware that if you read a password into an environment variable it is in 
memory in plain text, and thus may be accessed via a core dump or
/proc/core (if your OS provides /proc/). It is also in the process environment, 
which may be accessible by other processes. You may be better off using 
certificates with SSH, if possible. In any case, it is wise to assume that root 
and possibly other users on the machine may gain access to the password, so 
you should handle the situation accordingly. 


WARNING 


Some older scripts may use stty -echo to disable the screen echo while a 
password is being entered. The problem with that is if the user breaks the script, 
echo will still be off. Experienced users will know to type stty sane to fix it, 
but it’s very confusing. If you still need to use this method, set a trap to turn 
echo back on when the script terminates. See Recipe 10.6. 


See Also 


• help read
• Recipe 10.6, “Trapping Interrupts”
• Recipe 14.14, “Leaking Passwords into the Process List”
• Recipe 14.20, “Using Passwords in Scripts”
• Recipe 14.21, “Using SSH Without a Password”
• Recipe 19.9, “Making Your Terminal Sane Again”


Chapter 4. Executing Commands


The main purpose of bash (or of any shell) is to allow you to interact with the 
computer’s operating system so that you can accomplish whatever you need 
to do. Usually that involves launching programs, so the shell takes the 
commands you type, determines from that input what programs need to be 
run, and launches them for you. 


Let’s take a look at the basic mechanism for launching jobs and explore some 
of the features bash offers for launching programs in the foreground or the 
background, sequentially or in parallel, indicating whether they succeeded, 
and more. 


4.1 Running Any Executable 


Problem 


You need to run a command on a Linux or Unix system. 


Solution 


Use bash and type the name of the command at the prompt: 


$ someprog 


Discussion 


This seems rather simple, and in a way it is, but a lot goes on behind the 
scenes that you never see. What’s important to understand about bash is that 
its basic operation is to load and execute programs. All the rest is just 
window dressing to get ready to run programs. Sure, there are shell variables 
and control statements for looping and if/then/else branching, and there
are ways to control input and output, but they are all icing on the cake of
program execution.


So where does it get the program to run? 


bash uses a shell variable called $PATH to locate your executable. The $PATH 
variable is a list of directories. The directories are separated by colons (:). 
bash searches in each of those directories for a file with the name that you 
specified. The order of the directories is important—bash looks at the order 
in which the directories are listed in the variable, and takes the first 
executable found: 


$ echo $PATH 
/bin:/usr/bin:/usr/local/bin:.
$ 


In the $PATH variable shown here, four directories are included. The last 
directory in the list is just a single dot (called the dot directory, or just dot), 
which represents the current directory on a Linux or Unix filesystem— 
wherever you are, that’s the directory to which dot refers. For example, when 
you copy a file from someplace to dot (i.e., cp /other/place/file .),
you are copying the file into the current directory. Listing the dot directory in 
your path tells bash to look for commands not just in those other directories, 
but also in the current directory (.). 
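

To make the search order concrete, here is a rough sketch of the same left-to-right walk written as a shell function. The name first_in_path is hypothetical and the code is only an approximation of what bash does internally (it ignores aliases, functions, builtins, and bash’s remembered locations); in practice the type builtin answers the same question.

```shell
# Hypothetical helper: print the first executable named $1 found in $PATH.
# The body runs in a subshell ( ... ) so the IFS change does not leak out.
first_in_path() (
    IFS=:                               # split $PATH on colons
    set -f                              # no filename globbing while splitting
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.          # an empty entry also means "dot"
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
            exit 0                      # first match wins
        fi
    done
    exit 1                              # not found anywhere in $PATH
)

first_in_path ls || echo "ls not found in PATH" >&2
```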


Many people feel that putting dot in the $PATH is too great a security risk:
someone could trick you and get you to run their own malicious version of a
command (say, ls) in place of one that you were expecting. If dot were listed
first, then someone else’s version of ls would supersede the normal ls
command, and you might unwittingly run that command. Don’t believe us?
Try this:


$ bash
$ cd
$ touch ls
$ chmod 755 ls
$ PATH=".:$PATH"
$ ls


Suddenly, the ls appears not to work in your home directory. You get no
output. When you cd to some other location (e.g., cd /tmp), then ls will
work, but not in your home directory. Why? Because in that directory there is
an empty file called ls that is run (and does nothing, since it’s empty) instead of
the normal ls command located at /bin/ls. Since we started this example by
running a new copy of bash, you can exit from this mess by exiting this
subshell, but you might want to remove the bogus ls command first:


$ cd
$ rm ls
$ exit
$


Can you see the potential danger of wandering into a strange directory with
your PATH set to search the dot directory before anywhere else?


If you put dot as the last directory in your $PATH variable, at least you won’t 
be tricked that easily. Of course, if you leave it off altogether it is arguably 
even safer, and you can still run commands in your local directory by typing 
a leading dot and slash character, as in: 


./myscript 


The choice is yours. 


WARNING 


Never allow dot or writable directories in root’s $PATH. For more on this topic, 
see Recipe 14.9 and Recipe 14.10. 


NOTE


Don’t forget to set execute permissions on the file before you invoke your 
script: 


chmod +x myscript


You only need to set the permissions once. Thereafter, you can invoke the 
script as a command. 


A common practice among some bash users is to create a personal bin 
directory, analogous to the system directories /bin and /usr/bin where 
executables are kept. In your personal bin (if you create it in your home 
directory, its path is ~/bin) you can put copies of your favorite shell scripts 
and other customized or private commands. Then add that directory to your 
$PATH, even to the front (PATH=~/bin:$PATH). That way, you can still have
your own customized favorites without the security risk of running 
commands from strangers. 
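

If you add the directory from a startup file, it is worth guarding against adding it twice or adding a directory that does not exist. The following is one possible sketch; the function name prepend_path is our own and not something bash provides.

```shell
# Hypothetical helper: put directory $1 at the front of $PATH, but only
# if it exists and is not already listed.
prepend_path() {
    [ -d "$1" ] || return 1            # skip directories that do not exist
    case ":$PATH:" in
        *":$1:"*) ;;                   # already in $PATH, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

prepend_path "$HOME/bin" || echo "no $HOME/bin directory yet" >&2
```

Wrapping $PATH in colons before matching lets one pattern catch the directory whether it appears first, last, or in the middle of the list.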


See Also 

• Chapter 16 for more on customizing your environment
• Recipe 1.5, “Finding and Running Commands”
• Recipe 14.9, “Finding World-Writable Directories in Your $PATH”
• Recipe 14.10, “Adding the Current Directory to the $PATH”
• Recipe 16.11, “Keeping a Private Stash of Utilities by Adding ~/bin”
• Recipe 19.1, “Forgetting to Set Execute Permissions”


4.2 Running Several Commands in Sequence 


Problem 


You need to run several commands, but some take a while and you don’t 
want to wait for each one to finish before issuing the next command. 


Solution 


There are three solutions to this problem, although the first is rather trivial: 
just keep typing. A Linux or Unix system is advanced enough to be able to let
you type while it works on your previous commands, so you can simply keep 
typing one command after another. 


Another rather simple solution is to type those commands into a file and then 
tell bash to execute the commands in the file (i.e., a simple shell script). For
example, assume that we want to run three commands, long, medium, and 
short, each of whose execution time is reflected in its name. We need to run 
them in that order, but don’t want to wait around for the long script to finish 
before typing the other commands. We could use a shell script (a.k.a. batch 
file). Here’s a primitive way to do that: 


$ cat > simple.script
long
medium
short
^D # Ctrl-D, not visible
$ bash ./simple.script


The third, and arguably best, solution is to run each command in sequence. If 
you want to run each program regardless of whether the preceding ones fail, 
separate them with semicolons: 


long ; medium ; short 


If you only want to run the next program if the preceding program worked, 
and all the programs correctly set exit codes, separate them with double 
ampersands: 


long && medium && short 


Discussion 


The cat example was just a very primitive way to enter text into a file: we 
redirected the output from the command into the file named simple.script (for 
more on redirecting output, see Chapter 2). You would be better off using a real
editor, but such things are harder to show in examples like this. From now on,
when we want to show a script, we’ll either just show the text as disembodied
text not on a command line, or start the example with a command like cat
filename to dump the contents of the file to the screen (rather than
redirecting output from our typing into the file), and thus display it in the
example.


The main point of this simple solution is to demonstrate that more than one 
command can be put on the bash command line. In the first case the second 
command isn’t run until the first command exits, the third doesn’t execute 
until the second exits, and so on, for as many commands as you have on the 
line. In the second case the second command isn’t run unless the first 
command succeeds, the third doesn’t execute unless the second succeeds, and 
so on, for as many commands as you have on the line. 
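

The difference between the two separators shows up as soon as one command fails. In this sketch the three programs are replaced by stand-in shell functions of our own, and long is made to fail on purpose by returning a nonzero status.

```shell
# Stand-ins for the three programs; long pretends to fail (nonzero status).
long()   { echo "long ran";   return 1; }
medium() { echo "medium ran"; }
short()  { echo "short ran"; }

# Semicolons: every command runs, whatever happened before it.
with_semicolons=$(long ; medium ; short)

# Double ampersands: the chain stops at the first failure.
with_ampersands=$(long && medium && short)

echo "--- with semicolons ---"
echo "$with_semicolons"
echo "--- with double ampersands ---"
echo "$with_ampersands"
```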


See Also 


• Chapter 2 on redirecting output


4.3 Running Several Commands All at Once 


Problem 


You need to run three commands, but they are independent of each other and 
don’t need to wait for the previous ones to complete. 


Solution 


You can run a command in the background by putting an ampersand (&) at 
the end of the command line. Thus, you could fire off all three commands in 
rapid succession as follows: 


$ long &
[1] 4592
$ medium &
[2] 4593
$ short
$


Or better yet, you can do it all on one command line: 


$ long & medium & short
[1] 4592
[2] 4593
$


Discussion 


When we run a command “in the background” (there really is no such place 
in Linux), all that really means is that we disconnect keyboard input from the 
command and the shell doesn’t wait for the command to complete before it 
gives another prompt and accepts more command input. Output from the 
command (unless we take explicit action to change this behavior) will still 
come to the screen, so in this example all three commands will be 
interspersing output to the screen. 


The odd bits of numerical output are the job number in square brackets, 
followed by the process ID of the command that we just started in the 
background. In our example, job 1 (process 4592) is the long command, and
job 2 (process 4593) is medium. 


We didn’t put short into the background since we didn’t put an ampersand at 
the end of the line, so bash will wait for it to complete before giving us the 
shell prompt (the $). 
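

Inside a script there is no prompt to come back to, so you usually want to know when the background work is done. A minimal sketch, assuming stand-in functions for the three programs and a sleep that accepts fractional seconds (common, but not guaranteed everywhere): the wait builtin pauses until all background jobs have finished.

```shell
# Stand-ins for the three programs, each just sleeping for a while.
long()   { sleep 0.3; echo "long finished"; }
medium() { sleep 0.2; echo "medium finished"; }
short()  { sleep 0.1; echo "short finished"; }

# long and medium start in the background, short runs in the foreground,
# and wait blocks until both background jobs have ended.
out=$(long & medium & short ; wait)
echo "$out"
```

The three lines normally appear in order of completion rather than in the order the commands were started.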


The job number or process ID can be used to provide limited control over a 
job. For example, we could kill the long job with kill %1 (since its job
number was 1), or we could specify the process number (i.e., kill 4592)
with the same deadly results. 
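

In a script you will not see the job number and process ID printed, but the special variable $! holds the process ID of the most recent background command. The following sketch uses sleep 30 as a stand-in for the long job; the variable names pid and status are ours.

```shell
sleep 30 &          # a stand-in for the long job
pid=$!              # $! is the process ID of the most recent background job
kill "$pid"         # sends SIGTERM (signal 15) by default
wait "$pid"         # collect the job's exit status
status=$?
echo "exit status after kill: $status"
```

The status reported by wait is 128 plus the signal number, so a job ended by the default SIGTERM shows up as 143.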


You can also use the job number to reconnect to a background job. For 
instance, we could connect the long job back to the foreground with fg %1. If
you only have one job running in the background, you don’t even need the 
job number; just use fg by itself. 


TIP
If you run a command and then realize it will take longer to complete than you 
thought, you can pause it using Ctrl-Z, which will return you to a prompt. You 
can then type bg to unpause the job and continue running it in the background. 
This is basically adding a trailing & after the fact. 


4.4 Telling Whether a Command Succeeded or Not 


Problem 


You need to know whether the command you ran succeeded. 


Solution 


The shell variable $? is set with a nonzero value if the command fails— 
provided that the programmer who wrote that command or shell script 
followed the established convention: 


$ somecommand
# it works...
$ echo $?
0
$ badcommand
# it fails...
$ echo $?
1
$


Discussion 


The exit status of a command is kept in the shell variable referenced with $?. 
Its value can range from 0 to 255. When you write a shell script, it’s a good 
idea to have your script exit with zero if all is well and a nonzero value if you 
encounter an error condition. We recommend using only 0 to 127 because the 
shell uses 128+N to denote killed by signal N. Also, if you use a number
greater than 255 or less than 0, the numbers will wrap around. You return an 
exit status with the exit statement (e.g., exit 1 or exit 0). But be aware 
that you only get one shot at reading a command’s exit status: 


$ badcommand
# it fails...
$ echo $?
1
$ echo $?
0
$


Why does the second echo give us 0 as a result? It’s actually reporting on the 
status of the immediately preceding echo command. The first time we typed 
echo $? it returned a 1, which was the return value of badcommand. But the 
echo command itself succeeds, and therefore the new, most recent status is 
success (i.e., a 0 value). Because you only get one chance to check the exit
status, many shell scripts will immediately assign the status to another shell 
variable, as in: 


$ badcommand
# it fails...
$ STAT=$?
$ echo $STAT
1
$ echo $STAT
1
$


We can keep the value around in the variable $STAT and check its value later 
on. 
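

Both points from this discussion, the wraparound of out-of-range values and the one-shot nature of $?, can be checked in a few lines. This is just an illustrative sketch; the variable names are ours, and false is the standard command that always fails.

```shell
bash -c 'exit 300'      # ask for a status that does not fit in 0..255
wrapped=$?              # 300 wraps around to 300 - 256

false                   # a command that always fails
first=$?                # the status of false
second=$?               # the status of the assignment above, not of false

echo "wrapped=$wrapped first=$first second=$second"
```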


Although we’re showing this in command-line examples, the real use of 
variables like $? comes in writing scripts. You can usually see whether a 
command worked or not if you are watching it run on your screen. But in a 
script, the commands may be running unattended. 




One of the great features of bash is that the scripting language is identical to 
commands as you type them at a prompt in a terminal window. This makes it 
much easier to check out syntax and logic as you write your scripts. 


The exit status is more often used in scripts, and often in if statements, to 
take different actions depending on the success or failure of a command. 
Here’s a simple example for now, but we will revisit this topic in future 
recipes: 


somecommand
if (( $? )) ; then echo failed ; else echo OK; fi


(( )) evaluates an arithmetic expression; see Recipes 6.1 and 6.2.


We also do not recommend using negative numbers. The shell will accept
them without an error, but it won’t do what you expect:


$ bash -c 'exit -2' ; echo $?
254

$ bash -c 'exit -200' ; echo $?
56


See Also 


• Recipe 4.5, “Running a Command Only if Another Command Succeeded”
• Recipe 4.8, “Displaying Error Messages When Failures Occur”
• Recipe 6.1, “Doing Arithmetic in Your Shell Script”
• Recipe 6.2, “Branching on Conditions”


4.5 Running a Command Only if Another Command Succeeded


Problem 


You need to run some commands, but you only want to run certain
commands if certain other ones succeed. For example, you’d like to change 
directories (using the cd command) into a temporary directory and remove all 
the files. However, you don’t want to remove any files if the cd fails (e.g., if 
permissions don’t allow you into the directory, or if you spell the directory 
name wrong). 


Solution 


You can use the exit status ($?) of the cd command in combination with an 
if statement to do the rm only if the cd was successful: 


cd mytmp 
if (( $? == 0 )); then rm * ; fi 


TIP 


A better way to write this is the following, but we think it’s more clear to show 
and explain as we did: 


if cd mytmp; then rm * ; fi 


Discussion 


Obviously, you wouldn’t need to do this if you were typing the commands by 
hand. You would see any error messages from the cd command, and thus you 
wouldn’t type the rm command. But scripting is another matter, and this test 
is very well worth doing in a script like our example to make sure that you 
don’t accidentally erase all the files in the directory where you are running it. 


Let’s say you ran that script from the wrong directory, one that didn’t have a 
subdirectory named mytmp. The cd would fail, so the current directory would 
remain unchanged. Without the if check (for the cd having failed) the script 
would just continue on to the next statement. Running the rm * would 
remove all the files in your current directory. Ouch. The if is worth it.
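

You can rehearse this failure safely in a throwaway directory. The sketch below is our own illustration: it creates a scratch directory with mktemp -d, puts one file in it, and then runs the guarded cd from there. Because no mytmp subdirectory exists, the cd fails and the rm is never reached. The names sandbox, precious.txt, and survived are made up for the example.

```shell
sandbox=$(mktemp -d)                 # scratch directory for the experiment
touch "$sandbox/precious.txt"        # a file we would hate to lose

(
    cd "$sandbox" || exit 1
    # There is no mytmp here, so cd fails and the rm is skipped.
    if cd mytmp 2>/dev/null; then
        rm -f -- *
    else
        echo "cd failed, nothing removed" >&2
    fi
)

survived=no
[ -f "$sandbox/precious.txt" ] && survived=yes
echo "file survived: $survived"
rm -rf "$sandbox"                    # clean up the scratch directory
```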


So how does $? get its value? It is the exit code of the command (see Recipe 
4.4). C language programmers will recognize this as the value of the 
argument supplied to the exit() function; e.g., exit(4); would return a 4. 
For the shell, an exit code of zero is considered success and a nonzero value 
means failure. 


If you’re writing bash scripts, you’ll want to be sure to explicitly set return
values, so that $? is set properly from your script. If you don’t, the value set 
will be the value of the last command run, which you may not want as your 
result. 
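
As a minimal sketch of setting the status on purpose, the hypothetical function check_dir below (our name for this illustration, not a bash builtin) returns 0 or 1 explicitly instead of passing along whatever its last command happened to return. In a standalone script you would use exit where the function uses return:

```shell
#!/usr/bin/env bash
# Sketch only: check_dir is a hypothetical helper name.
# It reports success (0) only if its argument is a directory.
check_dir() {
    if [ -d "$1" ]; then
        echo "$1 is a directory"
        return 0    # explicit success
    fi
    echo "$1 is not a directory" >&2
    return 1        # explicit failure, even though the echo itself succeeded
}

check_dir /
ROOT_STATUS=$?

check_dir /no/such/dir 2>/dev/null
MISSING_STATUS=$?

echo "statuses: $ROOT_STATUS $MISSING_STATUS"
```

Without the explicit return 1, the function would report the status of the final echo, which is almost always 0.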


See Also 
= Recipe 4.4, “Telling Whether a Command Succeeded or Not” 


= Recipe 4.6, “Using Fewer if Statements” 


4.6 Using Fewer if Statements 


Problem 


As a conscientious programmer, you took to heart what we described in the 
previous recipe. You applied the concept to your latest shell script, but now 
you find that the shell script is unreadable, with all those if statements 
checking the return code of every command. Isn’t there an alternative? 


Solution 


Use the double-ampersand operator in bash to provide conditional execution: 


cd mytmp && rm * 


Discussion 


Separating two commands by the double ampersands tells bash to run the 
first command and then to run the second command only if the first command 
succeeds (i.e., its exit status is 0). This is very much like using an if 
statement to check the exit status of the first command in order to protect the 
running of the second command: 


cd mytmp 
if (( $? == 0 )); then rm * ; fi 


The double-ampersand syntax is meant to be reminiscent of the logical AND 
operator in the C language. If you know your logic (and your C) then you’ll 
recall that if you are evaluating the logical expression A AND B, the entire 
expression can only be true if both (sub)expression A and (sub)expression B 
evaluate to true. If either one is false, the whole expression is false. The C 
language makes use of this fact, and when you code an expression like if (A 
&& B) { ... }, it will evaluate expression A first. If it is false, it won’t even 
bother to evaluate B since the overall outcome (false) has already been 
determined (by A being false). 


So what does this have to do with bash? Well, if the exit status of the first 
command (the one to the left of the &&) is nonzero (i.e., failed), then it won’t 
bother to evaluate the second expression; it won’t run the other command at 
all. 
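
To watch the short circuit happen without risking real files, here is a small sketch. It assumes a scratch directory made with mktemp (the name WORKDIR is ours), and it runs each chain in a subshell so that your own working directory never changes:

```shell
#!/usr/bin/env bash
# Sketch only: WORKDIR is a scratch directory created just for this demo.
WORKDIR=$(mktemp -d)
touch "$WORKDIR/a" "$WORKDIR/b"

# The cd succeeds, so the rm to its right runs.
( cd "$WORKDIR" && rm -- * )
[ -e "$WORKDIR/a" ] || echo "files removed after the successful cd"

# The cd fails, so the touch to its right is never run.
( cd "$WORKDIR/nosuchdir" 2>/dev/null && touch "$WORKDIR/should_not_exist" )
CHAIN_STATUS=$?
echo "failed chain returned status $CHAIN_STATUS"
```

The status of the whole chain is the status of the last command that actually ran, which here is the failed cd.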


If you want to be thorough about your error checking, but don’t want if 
statements all over the place, you can have bash exit any time it encounters a 
failure (i.e., a nonzero exit status) from every command in your script (except 
in while loops and if statements where it is already capturing and using the 
exit status) by setting the -e flag: 


set -e 
cd mytmp 
rm * 


Setting the -e flag will cause the shell to exit when a command fails. If the cd 
in this example fails, the script will exit and never even try to execute the rm 
* command. We don’t recommend doing this on an interactive shell, 
however, because when the shell exits it will make your shell window go 
away. 
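
If you want to experiment with set -e without losing your window, one option is to confine it to a subshell. The following is only a sketch under that assumption; SCRATCH is a throwaway directory we create for the demonstration:

```shell
#!/usr/bin/env bash
# Sketch only: run the fragile steps in a subshell so that set -e
# ends the subshell, not the shell you are working in.
SCRATCH=$(mktemp -d)
touch "$SCRATCH/keepme"

(
    set -e
    cd "$SCRATCH/nosuchdir" 2>/dev/null   # fails, so the subshell exits here
    rm -- *                               # never reached
)
SUBSHELL_STATUS=$?

echo "subshell exited with status $SUBSHELL_STATUS"
```

Note that the subshell is deliberately not followed by && or ||, because bash ignores the -e setting for commands whose status is being tested that way.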


See Also 


= Recipe 4.8, “Displaying Error Messages When Failures Occur” for an 
explanation of the || syntax, which is similar in some ways to but also 
quite different from the && construct 


4.7 Running Long Jobs Unattended 


Problem 


You ran a job in the background, then exited the shell and went for coffee. 
When you came back to check, the job was no longer running and it hadn’t 
completed. In fact, your job hadn’t progressed very far at all. It seems to have 
quit as soon as you exited the shell. 


Solution 


If you want to run a job in the background and expect to exit the shell before 
the job completes, then you need to nohup the job: 


$ nohup long & 
nohup: appending output to `nohup.out' 


Discussion 


When you put a job in the background (via the & as described in Recipe 4.3), 
it is still a child process of the bash shell. When you exit an instance of the 
shell, bash sends a hangup (hup) signal to all of its child processes. That’s 
why your job didn’t run for very long. As soon as you exited bash, it killed 
your background job. (Hey, you were leaving; how was it supposed to 
know?) 


The nohup command simply sets up the child process to ignore hangup 
signals. You can still kill the job with the kill command, because kill sends a 
SIGTERM signal, not a SIGHUP signal. But with nohup, bash won’t 
inadvertently kill your job when you exit. 


The message that nohup gives about appending your output is just nohup 
trying to be helpful. Since you are likely to exit the shell after issuing a nohup 
command, your output destination will likely go away; i.e., the bash session 
in your terminal will no longer be active, so the job won’t be able to write to 
STDOUT. More importantly, writing to a nonexistent destination would 
cause a failure. So nohup redirects the output for you, appending it (not 
overwriting, but adding at the end) to a file named nohup.out in the current 
directory. You can explicitly redirect the output elsewhere on the command 
line, and nohup is smart enough to detect that this has happened and not use 
nohup.out for your output. 
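
Here is a hedged sketch of redirecting the output yourself. The names LOGFILE and JOB_PID are ours, and the sh -c command is only a stand-in for your real long-running job:

```shell
#!/usr/bin/env bash
# Sketch only: LOGFILE and the stand-in job are assumptions for this demo.
LOGFILE=$(mktemp)

# Redirect stdout and stderr ourselves, so nohup has no reason to
# fall back on nohup.out. Input comes from /dev/null.
nohup sh -c 'sleep 1; echo "job finished"' > "$LOGFILE" 2>&1 < /dev/null &
JOB_PID=$!

# In real use you would log out now; here we wait so we can show the log.
wait "$JOB_PID"
cat "$LOGFILE"
```

In practice you would pick a log location that survives your logout, such as a file in your home directory, rather than a temporary file.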


See Also 


= Chapter 2 for various recipes on redirecting output, since you probably 
want to do that for a background job 


= Recipe 4.3, “Running Several Commands All at Once” 


= Recipe 10.1, “‘Daemon-izing’ Your Script” for more on running your 
script unattended 


= Recipe 17.4, “Recovering Disconnected Sessions Using screen” 


4.8 Displaying Error Messages When Failures 
Occur 


Problem 


You need your shell script to be verbose about failures. You want to see error 
messages when commands don’t work, but if statements tend to distract 
from the visual flow of statements. 


Solution 


A common idiom among some shell programmers is to use the || with 
commands to spit out debug or error messages. Here’s an example: 


cmd || printf "%b" "cmd failed. You're on your own\n" 


Discussion 


Similar to how the && in Recipe 4.6 tells bash not to bother to evaluate the 
second expression if the first one is false, the || tells the shell not to bother to 
evaluate the second expression if the first one is true (i.e., succeeds). As with 
&&, the || syntax harkens back to logic and the C language, where the 
outcome is determined (as true) if the first expression in A OR B evaluates to 
true, so there’s no need to evaluate the second expression. In bash, if the 
first expression returns 0 (i.e., succeeds) then it just continues on. Only if the 
first expression returns a nonzero value (i.e., if the exit value of the command 
indicates failure) must it evaluate the second part, and thus run the other 
command. 


Warning: don’t be fooled by this: 


cmd || printf "%b" "FAILED.\n" ; exit 1 


The exit will be executed in either case! The OR is only between the first two 
commands. If we want to have the exit happen only on error, we need to 
group it with the printf so that both are considered as a unit. The desired 
syntax would be: 


cmd || { printf "%b" "FAILED.\n" ; exit 1 ; } 


Note that the semicolon after the last command and just before the } is 
required, and that the closing brace must be separated by whitespace from the 
surrounding text. See Recipe 2.14 for a discussion. 
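
The difference is easy to see when the command succeeds. In this sketch, true stands in for a command that works, and each variant runs in its own subshell so that exit ends only that subshell:

```shell
#!/usr/bin/env bash
# Sketch only: "true" stands in for a command that succeeds.

# Pitfall: the exit is not part of the || list, so it always runs.
( true || printf "%b" "FAILED.\n" ; exit 1 )
UNGROUPED_STATUS=$?

# Grouped: printf and exit run together, and only when the command fails.
( true || { printf "%b" "FAILED.\n" ; exit 1 ; } )
GROUPED_STATUS=$?

echo "ungrouped: $UNGROUPED_STATUS, grouped: $GROUPED_STATUS"
```

Neither variant prints FAILED, but the ungrouped one still exits with status 1 even though nothing went wrong.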


See Also 


= Recipe 2.14, “Saving or Grouping Output from Several Commands” 


= Recipe 4.6, “Using Fewer if Statements” for an explanation of && syntax 


4.9 Running Commands from a Variable 


Problem 


You want to run different commands in your script depending on 
circumstances. How can you vary which commands run? 


Solution 


There are many solutions to this problem—it’s what scripting is all about. In 
coming chapters we’ll discuss various programming constructs that can be 
used to solve this problem, such as if /then/else, case statements, and 
more. But here’s a slightly different approach that reveals something about 
bash. We can use the contents of a variable (more on those in Chapter 5) not 
just for parameters, but also for the command itself: 


FN=/tmp/x.x 
PROG=echo 
$PROG $FN 
PROG=cat 
$PROG $FN 


Discussion 


We can assign the program name to a variable (here we use $PROG), and then 
when we refer to that variable in the place where a command name would be 
expected, bash uses the value of that variable ($PROG) as the command to run. 
It parses the command line, substitutes the values of its variables, and takes 
the result of all the substitutions and treats that as the command line, as if it 
had been typed that way verbatim. 
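
A runnable variation on the same idea follows. It is a sketch with two assumptions of our own: the file is a scratch file from mktemp rather than /tmp/x.x, and the variables are quoted so a filename containing spaces would still be passed as a single argument:

```shell
#!/usr/bin/env bash
# Sketch only: FN is a scratch file made for this demo.
FN=$(mktemp)
echo "contents of the file" > "$FN"

PROG=echo
ECHO_OUT=$("$PROG" "$FN")     # runs echo, so we get the filename back

PROG=cat
CAT_OUT=$("$PROG" "$FN")      # runs cat, so we get the file's contents

echo "$ECHO_OUT"
echo "$CAT_OUT"
```

The two command lines are spelled identically; only the value of $PROG decides which program runs.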


WARNING 


Be careful about the variable names you use. Some programs, such as InfoZip, 
use environment variables such as $ZIP and $UNZIP to pass settings to the 
program itself, so if you do something like ZIP=/usr/bin/zip you can spend 
days pulling your hair out wondering why it works fine from the command line, 
but not in your script. Trust us. We learned this one the hard way. Also, RTFM. 


See Also 

= Chapter 11 

= Recipe 14.3, “Setting a Secure $PATH” 

= Recipe 16.21, “Creating Self-Contained, Portable rc Files” 

= Recipe 16.22, “Getting Started with a Custom Configuration” 


= Appendix C for a description of all the various substitutions that are 
performed on a command line; you’ll want to read a few more chapters 
before tackling that subject 


4.10 Running All Scripts in a Directory 


Problem 


You want to run a series of scripts, but the list keeps changing; you’re always 
adding new scripts, but you don’t want to continuously modify a master list. 


Solution 


Put the scripts you want to run in a directory, and let bash run everything that 
it finds. Instead of keeping a master list, simply use the contents of that 
directory as your master list. Here’s a script that will run everything it finds 
in a particular directory: 


for SCRIPT in /path/to/scripts/dir/* 
do 
    if [ -f "$SCRIPT" -a -x "$SCRIPT" ] 
    then 
        $SCRIPT 
    fi 
done 


Discussion 


We discuss the for loop and the if statement in greater detail in Chapter 6, 
but this gives you a taste. The variable $SCRIPT will take on successive 
values for each file that matches the wildcard pattern *, which matches 
everything in the named directory (except invisible dot files, whose names 
begin with a period). If it is a file (the -f test) and has execute permissions 
set (the -x test), the shell will then try to run that script. 


In this simple example, we have provided no way to specify any arguments to 
the scripts as they are executed. This simple script may work well for your 
personal needs, but wouldn’t be considered robust; some might consider it 
downright dangerous. But we hope it gives you an idea of what lies ahead: 
some programming language-style scripting capabilities. 
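
For a version you can try safely, here is a sketch that builds its own scratch directory (SCRIPT_DIR and COUNT are names we made up) and quotes the variable when running each script. It assumes the temporary directory allows files to be executed:

```shell
#!/usr/bin/env bash
# Sketch only: SCRIPT_DIR is a scratch directory standing in for
# /path/to/scripts/dir, filled with two tiny scripts and one data file.
SCRIPT_DIR=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'echo "first ran"'  > "$SCRIPT_DIR/01-first"
printf '%s\n' '#!/bin/sh' 'echo "second ran"' > "$SCRIPT_DIR/02-second"
echo 'not a script' > "$SCRIPT_DIR/notes.txt"
chmod +x "$SCRIPT_DIR/01-first" "$SCRIPT_DIR/02-second"

COUNT=0
for SCRIPT in "$SCRIPT_DIR"/*
do
    # Quote the variable so names containing spaces still work.
    if [ -f "$SCRIPT" ] && [ -x "$SCRIPT" ]
    then
        "$SCRIPT"
        COUNT=$((COUNT + 1))
    fi
done
echo "ran $COUNT scripts"
```

The data file is skipped because it fails the -x test. The scripts run in the order the wildcard expands, which is why numeric prefixes are a common naming habit for directories like this.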


See Also 


= Chapter 6 for more about for loops and if statements 


Chapter 5. Basic Scripting: Shell Variables 


bash shell programming is a lot like any kind of programming, and that 
includes having variables—containers that hold strings and numbers, which 
can be changed, compared, and passed around. bash variables have some 
very special operators that can be used when you refer to a variable. bash also 
has some important built-in variables, ones that provide important 
information about the other variables in your script. This chapter takes a look 
at bash variables and some special mechanisms for referencing variables, and 
shows how they can be put to use in your scripts. 


Variables in a bash script are often written as all-uppercase names, though 
that is not required—just a common practice. You don’t need to declare 
them; just use them where you want them. They are basically all of type 
string, though some bash operations can treat their contents as a number. 
They look like this in use: 


# trivial script using shell variables 
# (but at least it is commented!) 
MYVAR="something" 
echo $MYVAR 
# similar but with no quotes 
MY_2ND=anotherone 
echo $MY_2ND 
# quotes are needed here: 
MYOTHER="more stuff to echo" 
echo $MYOTHER 


There are two significant aspects of bash variable syntax that may not be 
intuitively obvious. First, in the assignment, the name=value syntax is 
straightforward enough, but there cannot be any spaces around the equals 
sign. 


Let’s consider for a moment why this is the case. Remember that the main 
purpose of the shell is to launch programs—you name the program on the 
command line and that is the program that gets launched. Any words of text 
that follow that name on the command line are passed along as arguments to 
the program. For example, when you type: 


ls filename 


the word ls is the name of the command, and filename is the first and only 
argument in this example. 


Why is that relevant? Well, consider what a variable assignment would look 
like if you allowed spaces around the equals sign, like this: 


MYVAR = something 


Can you see that the shell would have a hard time distinguishing between the 
name of a command to invoke (like in the ls example) and the assignment of 
a variable? This would be especially true for commands that can use = 
symbols as one or more of their arguments (e.g., test). So to keep it simple, 
the shell doesn’t allow spaces around the equals sign in an assignment. The 
flip side of this is also worth noting—don’t use an equals sign in a filename, 
especially not one for a shell script (it is possible, just not recommended). 
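
You can see how the shell reads each spelling with a short sketch. It assumes there is no command named MYVAR on your system, and SPACED_STATUS is just a name we chose to hold the result:

```shell
#!/usr/bin/env bash
# Sketch only: shows what the shell does with each spelling.

MYVAR=something            # an assignment: no spaces around the =
echo "MYVAR is $MYVAR"

# With spaces, the shell looks for a command named MYVAR and passes it
# the arguments "=" and "otherthing". Assuming no such command exists,
# that fails with "command not found" (exit status 127).
MYVAR = otherthing 2>/dev/null
SPACED_STATUS=$?

echo "status: $SPACED_STATUS, MYVAR is still $MYVAR"
```

The second form never touches the variable at all, so the old value survives.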


The second aspect of shell variable syntax worth noting is the use of the 
dollar sign ($) when referring to a variable. You don’t use the dollar sign on 
the variable name to assign it a value, but you do use the dollar sign to get the 
value of the variable. (The exception to this is using variables inside a 
$(( )) expression.) In compiler jargon, this difference in syntax for assigning 
and retrieving the value is the difference between the L-value and the R-value 
of the variable (for Left and Right side of an assignment operator). 
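
A brief sketch of the three cases, using illustrative names of our own (COUNT, TOTAL, and SAME):

```shell
#!/usr/bin/env bash
# Sketch only: COUNT, TOTAL, and SAME are illustrative names.
COUNT=5                  # assigning: no dollar sign on the name
echo "COUNT is $COUNT"   # reading: dollar sign required

# Inside $(( )) a bare name is read as a variable, so the $ is optional.
TOTAL=$(( COUNT + 1 ))
SAME=$(( $COUNT + 1 ))
echo "TOTAL=$TOTAL SAME=$SAME"
```

Both arithmetic expressions yield the same number; which spelling you use inside $(( )) is a matter of style.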


Once again, the reason for this is simple disambiguation. Consider the 
following: 


MYVAR=something 
echo MYVAR is now MYVAR 


As this example tries to point out, how would one distinguish between the 
literal string MYVAR and the value of the $MYVAR variable? Use quotes, you 
say? If you were to require quoting around literal strings, then everything 
would get a lot messier; you would have to quote every non-variable name, 
which includes commands! Who wants to type: 


"ls" "-l" "/usr/bin/xmms" 


(Yes, for those of you who thought about trying it, it does work.) So rather 
than have to put quotes around everything, the onus is put on the variable 
reference by using the R-value syntax. Put a dollar sign on a variable name 
when you want to get at the value associated with that variable name: 


MYVAR=something 
echo MYVAR is now $MYVAR 


Just remember that since everything in bash is a string, we need the dollar 
sign to indicate a variable reference. We may also want to add braces around 
the variable name, for reasons we describe in Recipe 5.4. 


5.1 Documenting Your Script 


Problem 


Before we say one more word about shell scripts or variables, we have to say 
something about documenting your scripts. After all, you need to be able to 
understand your script even when several months have passed since you 
wrote it. 


Solution 


Document your script with comments. The # character denotes the beginning 
of a comment. All the characters after it on that line are ignored by the shell: 


# 
# This is a comment. 
# 
# Use comments frequently. 
# Comments are your friends. 


Discussion 

Some people have described shell syntax, regular expressions, and other parts 
of shell scripting as write-only syntax, implying that it is nearly impossible to 
understand the intricacies of many shell scripts. 

One of your best defenses against letting your shell scripts fall into this trap is 
the liberal use of comments (another is the use of meaningful variable 
names). It helps to put a comment before strange syntax or terse expressions: 


# replace the semicolon with a space 
NEWPATH=${PATH/;/ } 
# 
# switch the text on either side of a semicolon 
sed -e 's/^\(.*\);\(.*\)$/\2;\1/' < $FILE 


Comments can even be typed in at the command prompt with an interactive 
shell. This feature can be turned off, but it is on by default. There may be a 
few occasions when it is useful to make interactive comments. 


See Also 
= Table 5-1 


= “shopt Options” in Appendix A to learn how to turn interactive comments 
on or off 


5.2 Embedding Documentation in Shell Scripts 


Problem 


You want a simple way to provide formatted end-user documentation (e.g., 
manpages or HTML pages) for your script. You want to keep both code and 
documentation markup in the same file to simplify updates, distribution, and 
revision control. 


Solution 


Embed documentation in the script using the “do nothing” builtin (a colon) 
and a here-document, as illustrated in Example 5-1. 


Example 5-1. ch05/embedded_documentation 


#!/usr/bin/env bash 
# cookbook filename: embedded_documentation 


echo 'Shell script code goes here' 


# Use a : NOOP and here document to embed documentation, 
: <<'END_OF_DOCS' 


Embedded documentation such as Perl's Plain Old Documentation (POD), 
or even plain text here. 


Any accurate documentation is better than none at all. 
Sample documentation in Perl's Plain Old Documentation (POD) format adapted from 
CODE/ch07/Ch07.001_Best_Ex7.1 and 7.2 in the Perl Best Practices example tarball 
"PBP_code.tar.gz". 

=head1 NAME 

MY~PROGRAM--One-line description here 

=head1 SYNOPSIS 

MY~PROGRAM [OPTIONS] <file> 

=head1 OPTIONS 

-h = This usage. 
-v = Be verbose. 
-V = Show version, copyright, and License information. 

=head1 DESCRIPTION 

A full description of the application and its features. 
May include numerous subsections (i.e., =head2, =head3, etc.) 

[...] 

=head1 LICENSE AND COPYRIGHT 

=cut 

END_OF_DOCS 


Then, to extract and use that POD documentation, try these commands: 


# To read on-screen, automatically paginated 
$ perldoc myscript 


# Just the "usage" sections 
$ pod2usage myscript 


# Create an HTML version 
$ pod2html myscript > myscript.html 


# Create a manpage 
$ pod2man myscript > myscript.1 


Discussion 


Any plain-text documentation or markup can be used this way, either 
interspersed throughout the code, or better yet, collected at the end of the 
script. Since computer systems that have bash will probably also have Perl, 
its Plain Old Documentation (POD) format may be a good choice. Perl 
usually comes with pod2* programs to convert POD to HTML, LaTeX, 
manpage, text, and usage files. 


Damian Conway’s Perl Best Practices (O’Reilly) has some excellent library 
module and application documentation templates that could be easily 
translated into any documentation format, including plain text. See 
CODE/ch07/Ch07.001_Best_Ex7.1 and 7.2 in that book’s examples tarball. 


If you keep all of your embedded documentation at the very bottom of the 
script, you could also add an exit 0 right before the documentation begins. 
That will simply exit the script rather than forcing the shell to parse each line 
looking for the end of the here-document, so it will be a little faster. You 
need to be careful not to override a previous exit code from a command that 
failed, though, so consider using set -e. And do not use this trick if you 
intersperse code and embedded documentation in the body of the script. 
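
To illustrate the layout, the sketch below writes a tiny demo script to a scratch file and runs it. The names DEMO, DEMO_OUTPUT, and DEMO_STATUS are ours, and the POD content is abbreviated to a bare minimum:

```shell
#!/usr/bin/env bash
# Sketch only: writes a tiny demo script to a scratch file and runs it.
DEMO=$(mktemp)
cat > "$DEMO" <<'END_OF_DEMO'
#!/usr/bin/env bash
echo 'Shell script code goes here'

# All code is above this line; stop before the shell reaches the docs.
exit 0

: <<'END_OF_DOCS'
=head1 NAME

demo--One-line description here

=cut
END_OF_DOCS
END_OF_DEMO

DEMO_OUTPUT=$(bash "$DEMO")
DEMO_STATUS=$?
echo "$DEMO_OUTPUT (status $DEMO_STATUS)"
```

Remember the caveat about exit codes: the unconditional exit 0 in this sketch would hide a failure from any command above it, unless the script stops at the failure first.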


See Also 


= set -e in Recipe 4.6, “Using Fewer if Statements” 


= http://examples.oreilly.com/perlbp/PBP_code.tar.gz 


5.3 Promoting Script Readability 


Problem 


You’d like to make your script as readable as possible for ease of 
understanding and future maintenance. 


Solution 

Follow these best practices: 

= Document your script as noted in Recipe 5.1 and Recipe 5.2. 

= Indent and use vertical whitespace wisely. 

= Use meaningful variable names. 

= Use functions (Recipe 10.4), and give them meaningful names. 

= Break lines at meaningful places at less than 76 characters or so. 

= Put the most meaningful bits to the left. 


Discussion 


Document your intent, not the trivial details of the code. If you follow the rest 
of the points, the code should be pretty clear. Write reminders, provide 
sample data layouts or headers, and make a note of all the details that are in 
your head now, as you write the code. Document the code itself too if it is 
subtle or obscure. 


We recommend indenting using four spaces per level, with no tabs and 
especially no mixed tabs and spaces. There are many reasons for this, though 
it often is a matter of personal preference or company standards. Four spaces 
is always four spaces, no matter how your editor (excepting proportional 
fonts) or printer is set. It’s big enough to be easily visible as you glance 
across the script but small enough that you can have several levels of 
indenting without running the lines off the right side of your screen or printed 
page. We also suggest indenting continued lines with two additional spaces, 
or as needed, to make the code the most clear. 


Use vertical whitespace, with separators if you like them, to create blocks of 
similar code. Of course, you’ll do that with functions as well. 


Use meaningful names for variables and functions, and spell them out. The 
only time $i or $x is ever acceptable is in a for loop. You may think that 
short, cryptic names are saving you time and typing now, but we guarantee 
that you will lose that time 10- or 100-fold somewhere down the line when 
you have to fix or modify your script. 


Break long lines at around 76 characters. Yes, we know that most screens (or 
rather, terminal programs) can handle a lot more than that, but 80-character 
paper and screens are still the default, and it never hurts to have some 
whitespace to the right of the code. Constantly having to scroll to the right or 
having lines wrap awkwardly on the screen or printout is annoying and 
distracting. Don’t cause it. 


Unfortunately, there are sometimes exceptions to the long line rule. When 
creating lines to pass elsewhere, perhaps via Secure Shell (SSH), and in 
certain other cases, breaking up the line can cause many more code 
headaches than it solves. But in most cases, it makes sense. 


Try to put the most meaningful bits to the left when you break a line—we 
read shell code left-to-right, so the unusual fact of a continued line will stand 
out more. It’s also easier to scan down the left edge of the code for continued 
lines, should you need to find them. Which is more clear? 


# Good 
[ -n "$results" ] \ 
  && echo "Got a good result in $results" \ 
  || echo 'Got an empty result, something is wrong' 


# Also good 
[ -n "$results" ] && echo "Got a good result in $results" \ 
                  || echo 'Got an empty result, something is wrong' 


# OK, but not ideal 
[ -n "$results" ] && echo "Got a good result in $results" \ 
  || echo 'Got an empty result, something is wrong' 


# Bad 
[ -n "$results" ] && echo "Got a good result in $results" || \ 
echo 'Got an empty result, something is wrong' 


# Bad (trailing \s are optional here, but recommended for clarity) 
[ -n "$results" ] && \ 
  echo "Got a good result in $results" || \ 
  echo 'Got an empty result, something is wrong' 


See Also 


= Recipe 5.1, “Documenting Your Script” 
= Recipe 5.2, “Embedding Documentation in Shell Scripts” 


= Recipe 10.4, “Defining Functions” 


5.4 Separating Variable Names from Surrounding Text 


Problem 


You need to print a variable along with other text. You are using the dollar 
sign in referring to the variable, but how do you distinguish the end of the 
variable name from other text that follows? For example, say you wanted to 
use a shell variable as part of a filename, as in: 


for FN in 1 2 3 4 5 
do 
    somescript /tmp/rep$FNport.txt 
done 


How will the shell read that? It will think that the variable name starts with 
the $ and ends with the punctuation. In other words, it will think that 
$FNport is the variable name, not the intended $FN. 


Solution 


Use the full syntax for a variable reference, which includes not just the dollar 
sign, but also braces around the variable name: 


somescript /tmp/rep${FN}port.txt 


Discussion 


Because shell variable names can contain only alphanumeric characters and the 
underscore, there are many instances where you won’t need to use the braces. 
Any whitespace or punctuation (except the underscore) provides enough of a 
clue to where the variable name ends. But when in doubt, use the braces. In 
fact, some people would argue that always using the braces is a good habit: you 
never have to worry about whether they are needed, and your scripts get a 
consistent look throughout. Others find that to be too much 
typing of characters that are optional but awkward to reach, and think they 
can make the code look very busy or noisy. Ultimately, it’s a matter of 
personal preference. 
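
The following sketch shows both spellings side by side. Instead of calling somescript, it simply collects the filenames each form would produce; WITHOUT and WITH are names we picked for the two results:

```shell
#!/usr/bin/env bash
# Sketch only: builds the filenames instead of calling somescript.
unset FNport    # make sure no variable by that name is lurking
WITHOUT=""
WITH=""
for FN in 1 2 3
do
    # $FNport is read as a variable named FNport, which is unset (empty).
    WITHOUT="$WITHOUT /tmp/rep$FNport.txt"
    # ${FN}port is the value of FN followed by the literal text "port".
    WITH="$WITH /tmp/rep${FN}port.txt"
done
echo "without braces:$WITHOUT"
echo "with braces:   $WITH"
```

Without the braces every iteration produces the same name, /tmp/rep.txt, and the shell gives no warning about it.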


See Also 
= Recipe 1.8, “Using Shell Quoting” 


5.5 Exporting Variables 


Problem 


You defined a variable in one script, but when you called another script it 
didn’t know about the variable. 


Solution 


Export variables that you want to pass on to other scripts: 


export MYVAR 
export NAME=value 


Discussion 


Sometimes it’s a good thing that one script doesn’t know about the other 
script’s variables. If you called a shell script from within a for loop in the 
first script, you wouldn’t want the second script messing up the iterations of 
your for loop (which it probably can’t do anyway since it’s almost certainly 
running in a subshell, but work with us here). 


But sometimes you do want the information passed along. In those cases, you 
can export the variable so that its value is passed along to any other program 
that the script invokes. 


If you want to see a list of all the exported variables, just type the command 
env (or use the builtin export -p) for a list of each variable and its value. 
All of these are available for your script when it runs. Many will have already 
been set up by the bash startup scripts (see Chapter 16 for more on 
configuring and customizing bash). 


You can make the export part of any variable assignment, though that won’t 
work in old versions of the shell. You can also have the export statement 
just name the variable that will be exported. Though the export statement 
can be put anywhere prior to where you need the value to be exported, script 
writers often group these statements together, like variable declarations, at the 
top of a script. 


Once exported, you can assign repeatedly to the variable without exporting it 
each time. So, sometimes you'll see statements like: 


export FNAME 
export SIZE 
export MAX 
MAX=2048 
SIZE=64 
FNAME=/tmp/scratch 


and at other times you’ll see: 


export FNAME=/tmp/scratch 
export SIZE=64 
export MAX=2048 


FNAME=/tmp/scratch2 


FNAME=/tmp/stillexported 


One word of caution: the exported variables are, in effect, call by value. 
Changing the value of the exported variable in the called script does not 
change that variable’s value back in the calling script. 
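
A small sketch makes the one-way nature of export visible. Here bash -c stands in for a separate called script, and CHILD_SAW is a name of our own:

```shell
#!/usr/bin/env bash
# Sketch only: "bash -c" stands in for a separate called script.
export VAL=original

# The child gets its own copy of VAL; changing it there is invisible here.
CHILD_SAW=$(bash -c 'echo "$VAL"; VAL=changed')

echo "child saw: $CHILD_SAW"
echo "parent still has: $VAL"
```

The child receives the exported value, but its assignment affects only its own copy.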


This raises the question, “How would you pass back a changed value from the 
called script?” Answer: you can’t. 


You can only design your scripts so that they don’t need to do this. What 
mechanisms have people used to cope with this limitation? 


One approach might be to have the called script echo its changed value as 
output from the script, letting you read the output with the resulting changed 
value. For example, suppose one script exports a variable $VAL and then calls 
another script that modifies $VAL. To get the new value of $VAL in the 
original script, you have to write the changed value to standard output, 
capture it, and assign it to the variable, as in: 


VAL=$(anotherscript) 


(See Recipe 10.5 for an explanation of the $() syntax.) You could even 
change multiple values and echo them each in turn to standard output. The 
calling program could then use a shell read to capture each line of output one 
at a time into the appropriate variables. This requires that the called script 
write no other output to standard output (at least not before or among the 
variables), however, and sets up a very strong interdependency between the 
scripts (not good from a maintenance standpoint). 
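
As a sketch of the multiple-value approach, the hypothetical function report_sizes below plays the part of the called script, writing one value per line and nothing else. The caller reads the lines in the same order:

```shell
#!/usr/bin/env bash
# Sketch only: report_sizes is a hypothetical stand-in for the called
# script; it writes one value per line to standard output and nothing else.
report_sizes() {
    echo 64
    echo 2048
}

# Read the lines, in order, into the matching variables.
{
    read -r SIZE
    read -r MAX
} <<EOF
$(report_sizes)
EOF

echo "SIZE=$SIZE MAX=$MAX"
```

The fragility mentioned above is easy to spot here: if the called script ever changes the order of its output or prints anything extra, the wrong values land in the wrong variables.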


See Also 
= help export 
= Chapter 16 for more information on configuring and customizing bash 
= Recipe 5.6, “Seeing All Variable Values” 
= Recipe 10.5, “Using Functions: Parameters and Return Values” 
= Recipe 19.5, “Expecting to Change Exported Variables” 


5.6 Seeing All Variable Values 


Problem 


How can you see which variables have been exported and what values they 
have? Do you have to echo each one by hand? How can you tell if they are 
exported? 


Solution 


Use the set command to see the values of all variables and function 
definitions in the current shell. 


Use the env (or export -p) command to see only those variables that have 
been exported and would be available to a subshell. 


In bash version 4 or newer, you can also use the declare -p command. 


Discussion 


The set command, with no other arguments, produces (on standard output) a 
list of all the shell variables currently defined along with their values, in a 
name=value format. The env command is similar. If you run either, you will 
find a rather long list of variables, many of which you might not recognize. 
Those variables have been created for you, as part of the shell’s startup 
process. 


The list produced by env is a subset of the list produced by set, since not all 
variables are exported. 


If there are particular variables or values that are of interest, and you don’t 
want the entire list, just pipe it into a grep command. For example: 


set | grep MY 


will show only those variables whose name or value has the two-character 
sequence MY somewhere in it. 


The output from the newer declare -p command shows the variable names 
and values as if they were being declared and initialized. Here is a snippet of 
output: 


$ declare -p 
declare -i MYCOUNT="5" 
declare -x MYENV="10.5.1.2" 
declare -r MYFIXED="unchangeable" 
declare -a MYRA=([0]="5" [1]="10" [2]="15") 
$ 


The output is in the form of declare statements that could be used as source 
code in a shell script to recreate these variables and their values. The various 
arguments (-i, -x, -r, -a) indicate that the variable is an integer, has been 
exported, is read-only, or is an array, respectively. 
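
If you only care about a few variables, declare -p also accepts variable names and reports just those. A minimal sketch, with COUNT_DECL and ENV_DECL as names of our own to hold the output:

```shell
#!/usr/bin/env bash
# Sketch only: MYCOUNT and MYENV are illustrative names.
declare -i MYCOUNT=5
export MYENV="10.5.1.2"

# Naming variables after -p limits the listing to just those variables.
COUNT_DECL=$(declare -p MYCOUNT)
ENV_DECL=$(declare -p MYENV)

echo "$COUNT_DECL"
echo "$ENV_DECL"
```

This is a quick way to check whether a particular variable has been exported: look for the -x in its declare line.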


See Also 


= help set 
= help export 
= help declare 
= man env 
= Chapter 16 for more on configuring and customizing bash 
= Appendix A for reference lists for all of the built-in shell variables 


5.7 Using Parameters in a Shell Script 


Problem 


You want users to be able to invoke your script with a parameter. You could 
require that users set a shell variable, but that seems clunky. You also need to 
pass data to another script. You could agree on environment variables, but 
that ties the two scripts together too closely. 


Solution 


Use command-line parameters. Any words on the command line that follow 
the name of a shell script are available to the script as numbered variables. 
Suppose we have the following script, simplest.sh: 


# simple shell script 
echo $1 


The script will echo the first parameter supplied on the command line when it 
is invoked. Here it is in action: 


$ cat simplest.sh
# simple shell script
echo ${1}
$ ./simplest.sh you see what I mean
you
$ ./simplest.sh one more time
one
$


Discussion 


The other parameters are available as ${2}, ${3}, ${4}, ${5}, and so on. 
You don’t need the braces for the single-digit numbers, except to separate the 
variable name from the surrounding text. Typical scripts have only a handful 
of parameters, but when you get to ${10} you need to use the braces because 
the shell will interpret $10 as ${1} followed immediately by the literal string 
0, as we see here: 


$ cat tricky.sh
echo $1 $10 ${10}
$ ./tricky.sh I II III IV V VI VII VIII IX X XI
I I0 X
$


The tenth argument has the value X, but if you write $10 in your script, the 
shell will give you $1, the first parameter, followed immediately by a zero, 
the literal character that you put next to the $1 in your echo statement. 
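
The same rule applies to the arguments of a shell function, which gives us an easy way to try it without creating a script file. This is only a sketch, and show_tenth is a made-up helper name:

```shell
# show_tenth is a hypothetical helper; it prints three expansions of its
# own arguments so they can be compared side by side.
show_tenth() {
    echo "$1" "$10" "${10}"
}

show_tenth I II III IV V VI VII VIII IX X XI    # prints: I I0 X
```

Only the braced form reaches the tenth argument; $10 is still $1 with a zero stuck on the end.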


See Also


• Recipe 5.4, “Separating Variable Names from Surrounding Text”


5.8 Looping Over Arguments Passed to a Script 


Problem 


You want to take some set of actions for a given list of arguments. You could 
write your shell script to perform those actions for one argument and use $1 
to reference the parameter. But what if you’d like to do this for a whole 
bunch of files? You would like to be able to invoke your script like this: 


./actall *.txt


knowing that the shell will pattern match and build a list of filenames that
match the *.txt pattern (any filename ending with .txt).


Solution 


Use the shell special variable $* to refer to all of your arguments, and use 
that in a for loop as in Example 5-2. 


Example 5-2. ch05/chmod_all.1 


#!/usr/bin/env bash 
# cookbook filename: chmod_all.1 
# 
# change permissions on a bunch of files 
# 
for FN in $*
do
    echo changing $FN
    chmod 0750 $FN
done


Discussion


The variable $FN is our choice; we could have used any shell variable name
we wanted there. The $* refers to all the arguments supplied on the command 
line. For example, if the user types: 


./actall abc.txt another.txt allmynotes.txt 


the script will be invoked with $1 equal to abc.txt, $2 equal to another.txt,
and $3 equal to allmynotes.txt, but $* will be equal to the entire list. In other
words, after the shell has substituted the list for $* in the for statement, it
will be as if the script had read:


for FN in abc.txt another.txt allmynotes.txt
do
    echo changing $FN
    chmod 0750 $FN
done


The for loop will take the first value from the list, assign it to the variable
$FN, and proceed through the list of statements between the do and the done.
It will then repeat that loop for each of the other values.


But you’re not finished yet! This script works fine when filenames have no 
spaces in them, but sometimes you encounter filenames with spaces. Read the 
next two recipes to see how this script can be improved. 


See Also


• help for
• Recipe 5.9, “Handling Parameters with Spaces”
• Recipe 5.10, “Handling Lists of Parameters with Spaces”
• Recipe 6.12, “Looping with a Count”


5.9 Handling Parameters with Spaces


Problem


You wrote a script that took a filename as a parameter and it seemed to work, 
but then one time your script failed. The filename, it turns out, had an 
embedded space. 


Solution 


You'll need to be careful to quote any shell parameters that might contain 
filenames. When referring to a variable, put the variable reference inside 
double quotes. 


Discussion 


Thanks a lot, Apple! Trying to be user-friendly, the designers popularized the 
concept of space characters as valid characters in filenames, so users could 
name their files with names like My Report and Our Dept Data instead of the 
ugly and unreadable MyReport and Our_Dept_Data. (How could anyone 
possibly understand what those old-fashioned names meant?) Well, that 
makes life tough for the shell, where the space is the fundamental separator 
between words, so filenames were always kept to a single word. Not so 
anymore. 


So how do we handle this? 


Where a shell script once had simply ls -l $1, it is better to write ls -l 
"$1", with quotes around the parameter. Otherwise, if the parameter has an 
embedded space, it will be parsed into separate words, and only part of the 
name will be in $1. Let’s show you how this doesn’t work: 


$ cat simpls.sh
# simple shell script
ls -l ${1}
$
$ ./simpls.sh Oh the Waste
ls: Oh: No such file or directory
$


If we don’t put quotes around the filename when we invoke the script, bash
sees three arguments and substitutes the first argument (Oh) for $1. The ls
command runs with Oh as its only argument and can’t find that file.


So now let’s put quotes around the filename when we invoke the script: 


$ ./simpls.sh "Oh the Waste"
ls: Oh: No such file or directory
ls: the: No such file or directory
ls: Waste: No such file or directory
$


Still not good. bash has taken the three-word filename and substituted it for
$1 on the ls command line in our script. So far so good. Since we don’t have
quotes around the variable reference in our script, however, ls sees each word
as a separate argument (i.e., as separate filenames). Again, it can’t find any
of them.


Let’s try a script that quotes the variable reference:


$ cat quoted.sh
# note the quotes
ls -l "${1}"
$
$ ./quoted.sh "Oh the Waste"
-rw-r--r-- 1 smith users 28470 2007-01-11 19:22 Oh the Waste
$


When we quoted the reference "${1}" it was treated as a single word (a
single filename), so ls then had only one argument, the filename, and
it could complete its task.
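
If you would like to reproduce the difference without cluttering your own directories, something like the following sketch should do. It works in a throwaway directory from mktemp, and the two helper functions (list_unquoted and list_quoted) are hypothetical names standing in for the two versions of the script:

```shell
# Work in a throwaway directory so the sketch touches no real files.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/Oh the Waste"

# Hypothetical helpers: the same ls call, without and with quotes.
list_unquoted() { ls -l $1; }
list_quoted()   { ls -l "$1"; }

# Unquoted: the name is split into separate words and ls fails.
if ! list_unquoted "$DEMO_DIR/Oh the Waste" >/dev/null 2>&1; then
    echo "unquoted reference failed"
fi

# Quoted: ls receives one argument and succeeds.
if list_quoted "$DEMO_DIR/Oh the Waste" >/dev/null; then
    echo "quoted reference worked"
fi

rm -rf "$DEMO_DIR"
```

Both calls quote the filename on the way in; the only difference is whether the reference to $1 inside the function is quoted.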


See Also 


• Chapter 19 for common goofs
• Recipe 1.8, “Using Shell Quoting” for tips on shell quoting
• Appendix C for more information on command-line processing


5.10 Handling Lists of Parameters with Spaces


Problem 


OK, you have quotes around your variable as the previous recipe 
recommended. But you’re still getting errors. It’s just like the script from 
Recipe 5.8, but it fails when a file has a space in its name: 


for FN in $*
do
    chmod 0750 "$FN"
done


Solution 


It has to do with the $* in the script, used in the for loop. For this case we 
need to use a different but related shell variable, $@. When it is quoted, the 
resulting list has quotes around each argument separately. The shell script 
should be written as shown in Example 5-3. 


Example 5-3. ch05/chmod_all.2 


#!/usr/bin/env bash 
# cookbook filename: chmod_all.2 
# 
# change permissions on a bunch of files 
# with better quoting in case of filenames with spaces 
# 
for FN in "$@"
do
    chmod 0750 "$FN"
done


Discussion 


The parameter $* expands to the list of arguments supplied to the shell script. 
If you invoke your script like this: 


myscript these are args


then $* refers to the three arguments these are args. And when it’s used in
a for loop, such as: 


for FN in $* 


the first time through the loop $FN is assigned the first word (these), the 
second time the second word (are), etc. 


If the arguments are filenames and they are put on the command line by 
pattern matching, as when you invoke the script this way: 


myscript *.mp3 


then the shell will match all the files in the current directory whose names 
end with the four characters .mp3, and they will be passed to the script. So 
consider an example where there are three MP3 files whose names are: 


vocals.mp3 
cool music.mp3 
tophit.mp3 


The second song title has a space in the filename between cool and music. 
When you invoke the script with: 


myscript *.mp3


you'll get, in effect:


myscript vocals.mp3 cool music.mp3 tophit.mp3


If your script contains the line:


for FN in $*


that will expand to:


for FN in vocals.mp3 cool music.mp3 tophit.mp3


which has four words in its list, not three. The second song title has a space 
as the fifth character (cool music.mp3), and the space causes the shell to see 
that as two separate words (cool and music.mp3), so $FN will be cool on the 
second iteration through the for loop. On the third iteration $FN will have the 
value music.mp3, but that is not the name of your file either, so you'll get
file-not-found error messages. 


It might seem logical to try quoting the $*, but this:


for FN in "$*"


will expand to:


for FN in "vocals.mp3 cool music.mp3 tophit.mp3"


and you will end up with a single value for $FN equal to the entire list. You'll
get an error message like this:


chmod: cannot access 'vocals.mp3 cool music.mp3 tophit.mp3': No such file or directory


Instead, you need to use the shell variable $@ and quote it. Left unquoted, $*
and $@ give you the same thing. But when quoted, bash treats them
differently. A reference to $* inside of quotes gives the entire list inside one
set of quotes, as we just saw. But a reference to $@ inside of quotes returns
not one string but a list of quoted strings, one for each argument.
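
One way to convince yourself is simply to count how many times each form of the loop goes around. The sketch below does that for the three MP3 names; count_words is a hypothetical helper, and it assumes the default value of $IFS:

```shell
# count_words is a hypothetical helper that reports how many words the
# for loop sees for each way of writing the argument list.
count_words() {
    local n_star=0 n_qstar=0 n_qat=0 FN
    for FN in $*;   do n_star=$((n_star + 1));   done
    for FN in "$*"; do n_qstar=$((n_qstar + 1)); done
    for FN in "$@"; do n_qat=$((n_qat + 1));     done
    echo "$n_star $n_qstar $n_qat"
}

count_words vocals.mp3 "cool music.mp3" tophit.mp3    # prints: 4 1 3
```

Only the quoted $@ yields three words, one per file.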


In our example using the MP3 filenames, this:


for FN in "$@"


will expand to:


for FN in "vocals.mp3" "cool music.mp3" "tophit.mp3"


You can see that the second filename is now quoted, so that its space will be 
kept as part of its name and not considered a separator between two words. 


The second time through this loop, $FN will be assigned the value cool
music.mp3, which has an embedded space. So, be careful how you refer to
$FN. You’ll probably want to put it in quotes too, so that the space in the
filename is kept as part of that string and not used as a separator. That is,
you'll want to use "$FN", as in:


chmod 0750 "$FN"


Shouldn’t you always use "$@" in your for loop? Well, it’s a lot harder to 
type, so for quick-and-dirty scripts, when you know your filenames don’t 
have spaces, it’s probably OK to keep using the old-fashioned $* syntax. For 
more robust scripting though, we recommend "$@" as the safer way to go. 
We’ll probably use them interchangeably throughout this book, because even 
though we know better, old habits die hard—and some of us never use spaces 
in our filenames! (Famous last words.) 


See Also


• Recipe 5.8, “Looping Over Arguments Passed to a Script”
• Recipe 5.9, “Handling Parameters with Spaces”
• Recipe 5.12, “Consuming Arguments”
• Recipe 6.12, “Looping with a Count”


5.11 Counting Arguments


Problem 


You need to know how many parameters the script was invoked with.


Solution


Use the shell builtin variable ${#}. Example 5-4 shows some scripting to
enforce an exact count of three arguments.


Example 5-4. ch05/check_arg_count 


#!/usr/bin/env bash
# cookbook filename: check_arg_count
#
# Check for the correct # of arguments:
# Use this syntax or use: if [ $# -lt 3 ]
if (( $# < 3 ))
then
    printf "%b" "Error. Not enough arguments.\n" >&2
    printf "%b" "usage: myscript file1 op file2\n" >&2
    exit 1
elif (( $# > 3 ))
then
    printf "%b" "Error. Too many arguments.\n" >&2
    printf "%b" "usage: myscript file1 op file2\n" >&2
    exit 2
else
    printf "%b" "Argument count correct. Proceeding...\n"
fi


And here is what it looks like when we run it, once with too many arguments
and once with the correct number of arguments:


$ ./myscript myfile is copied into yourfile
Error. Too many arguments.
usage: myscript file1 op file2
$ ./myscript myfile copy yourfile
Argument count correct. Proceeding...


Discussion 


After the opening comments (always a helpful thing to have in a script), we
have the if test to see whether the number of arguments supplied (found in
$#) is less than three, followed by an elif test to see whether it is greater
than three. In either case, we print an error message, remind the user of
the correct usage, and exit.


The output from the error messages is redirected to standard error. This is in 
keeping with the intent of standard error as the channel for all error messages. 


The script also has a different return value depending on the error that was 
detected. While not that significant here, it is useful for any script that might 
be invoked by other scripts, so that there is a programmatic way not only to 
detect failure (a nonzero exit value), but to distinguish between error types. 


One word of caution: don’t confuse ${#} with ${#VAR} or even ${VAR#alt} 
just because they all use the hash character (#) inside of braces. The first 
gives the number of arguments, whereas the second gives the length of the 
value in the variable VAR and the third does a certain kind of substitution. 
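
Seeing the three side by side may help keep them apart. In this sketch, show_hashes is a hypothetical helper and the value report.txt is made up:

```shell
# show_hashes is a hypothetical helper; VAR and its value are made up.
show_hashes() {
    local VAR="report.txt"
    echo "${#}"             # number of arguments passed to the function
    echo "${#VAR}"          # length of the string held in VAR
    echo "${VAR#report}"    # VAR with the leading "report" removed
}

show_hashes a b c
```

With three arguments this prints 3, then 10 (the length of report.txt), then .txt.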


See Also


• Recipe 4.4, “Telling Whether a Command Succeeded or Not”
• Recipe 5.1, “Documenting Your Script”
• Recipe 5.12, “Consuming Arguments”
• Recipe 5.18, “Changing Pieces of a String”
• Recipe 6.12, “Looping with a Count”


5.12 Consuming Arguments 


Problem 


For any serious shell script, you are likely to have two kinds of arguments— 
options that modify the behavior of the script and the real arguments you 
want to work with. You need a way to get rid of the option arguments after 
you’ve processed them. 


For example, you have this script: 


for FN in "$@"
do
    echo changing $FN
    chmod 0750 "$FN"
done


It’s simple enough—it echoes the filename that it is working on, then it 
changes that file’s permissions. But you want it to work quietly sometimes, 
not echoing the filename. How can you add an option to turn off this verbose 
behavior while preserving the for loop? 


Solution 


Use shift to remove an argument after you’ve handled it, as illustrated in 
Example 5-5. 


Example 5-5. ch05/use_up_option


#!/usr/bin/env bash
# cookbook filename: use_up_option
#
# use and consume an option
#
# parse the optional argument
VERBOSE=0
if [[ $1 = -v ]]
then
    VERBOSE=1
    shift
fi
#
# the real work is here
#
for FN in "$@"
do
    if (( VERBOSE == 1 ))
    then
        echo changing $FN
    fi
    chmod 0750 "$FN"
done


Discussion


We add a flag variable, VERBOSE, to tell us whether or not to echo the 
filename as we work. But once the shell script has seen the -v and set the 
flag, we don’t want the -v in the argument list any more. The shift 
statement tells bash to shift its arguments down one position, getting rid of 
the first argument ($1) as $2 becomes $1, $3 becomes $2, and so on. 


That way, when the for loop runs, the list of parameters (in $@) no longer 
contains the -v but starts with the next parameter. 


This approach of parsing arguments is all right for handling a single option, 
but if you want more than one option, you need a bit more logic. By 
convention, options to a shell script should not be dependent on position; e.g., 
myscript -a -p should be the same as myscript -p -a. Moreover, a 
robust script should be able to handle repeated options and either ignore them 
or report an error. For more robust parsing, see the recipe on bash’s getopts 
builtin (Recipe 13.1). 
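
Short of getopts, the "bit more logic" can be a while loop around a case statement, with a shift after each option. The following is only a sketch of that idea, not a replacement for getopts. The function name parse_flags, the variables OPT_A, OPT_P, and REMAINING, and the -a and -p flags themselves are all invented for illustration:

```shell
# parse_flags is a hypothetical helper. The -a and -p flags are made up;
# they may appear in either order, and repeats are simply accepted.
parse_flags() {
    OPT_A=0
    OPT_P=0
    while [ $# -gt 0 ]; do
        case "$1" in
            -a) OPT_A=1 ;;
            -p) OPT_P=1 ;;
            --) shift; break ;;      # explicit end of options
            -*) echo "unknown option: $1" >&2; return 1 ;;
            *)  break ;;             # first real argument
        esac
        shift                        # consume the option just handled
    done
    # Whatever is left in "$@" is the list of real arguments.
    REMAINING=$#
}

parse_flags -p -a file1 file2
echo "a=$OPT_A p=$OPT_P remaining=$REMAINING"    # prints: a=1 p=1 remaining=2
```

Inside a function, shift only affects the function's own arguments, which is why this sketch just records how many are left. In a real script you would more likely run the loop at the top level, so that "$@" is ready for the for loop afterward.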


See Also 

• help shift
• Recipe 5.8, “Looping Over Arguments Passed to a Script”
• Recipe 5.11, “Counting Arguments”
• Recipe 5.12, “Consuming Arguments”
• Recipe 6.15, “Parsing Command-Line Arguments”
• Recipe 13.1, “Parsing Arguments for Your Shell Script”
• Recipe 13.2, “Parsing Arguments with Your Own Error Messages”


5.13 Getting Default Values 


Problem


You have a shell script that takes arguments supplied on the command line.
You'd like to provide default values so that the most common values can be 
used without the user needing to type them every time. 


Solution 


Use the ${:-} syntax when referring to the parameter, and use it to supply a
default value:


FILEDIR=${1:-/tmp}


Discussion 


There are a series of special operators available when referencing a shell 
variable. The :- operator says that if the specified parameter (here, $1) is not
set or is null, whatever follows (/tmp in our example) should be used as the 
value. Otherwise, it will use the value that is already set. It can be used on 
any shell variable, not just the positional parameters ($1, $2, $3, etc.), but 
they are probably the most common use. 


Of course, you could do this the long way by constructing an if statement 
and checking to see if the variable is null or unset (we leave that as an 
exercise to the reader), but this sort of thing is so common in shell scripts that 
this syntax has been welcomed as a convenient shorthand. 
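
For comparison, here is roughly what the long way might look like next to the shorthand. Both functions are hypothetical helpers written for this sketch, and they are meant to print the same thing for any first argument:

```shell
# filedir_short and filedir_long are hypothetical helpers that should
# always print the same value.
filedir_short() {
    echo "${1:-/tmp}"
}

filedir_long() {
    local FILEDIR
    if [ -z "$1" ]; then    # true when $1 is unset or empty
        FILEDIR=/tmp
    else
        FILEDIR=$1
    fi
    echo "$FILEDIR"
}

filedir_short /var/spool
filedir_long  /var/spool
filedir_short
filedir_long
```

Seven lines versus one goes some way toward explaining why the shorthand caught on.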


See Also 


• The bash manpage on parameter substitution
• Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly), pages 91-92
• Classic Shell Scripting by Nelson H. F. Beebe and Arnold Robbins (O’Reilly), pages 113-114
• Recipe 5.14, “Setting Default Values”
• Table 5-1


5.14 Setting Default Values


Problem 


Your script relies on certain environment variables, either widely used ones 
(e.g., $USER) or ones specific to your own business. If you want to build a
robust shell script, you should make sure that these variables each have a 
reasonable value. So how do you guarantee a reasonable default value? 


Solution 


Use the assignment operator in the shell variable reference the first time you 
refer to it to assign a value to the variable if it doesn’t already have one, as in: 


cd ${HOME:=/tmp} 


Discussion 


The reference to $HOME in the example will return the current value of $HOME
unless it is empty or not set at all. In those cases (empty or not set), it will
return the value /tmp, which will also be assigned to $HOME so that further
references to $HOME will have this new value.


We can see this in action here: 


$ echo ${HOME:=/tmp}
/home/uid002
$ unset HOME    # generally not wise to do
$ echo ${HOME:=/tmp}
/tmp
$ echo $HOME
/tmp
$ cd ; pwd
/tmp
$


Once we unset the variable, it no longer had any value. When we then used
the := operator as part of our reference to it, the new value (/tmp) was
substituted. The subsequent references to $HOME returned its new value.


One important exception to keep in mind about the assignment operator: this
mechanism will not work with positional parameter arguments (e.g., $1 or
$*). For those cases, use :- in expressions like ${1:-default}, which will
return the value without trying to do the assignment.
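
For ordinary (named) variables, you don't need a cd or an echo just to trigger the assignment. A common idiom, sketched here, hangs the reference on the : builtin, which does nothing except have its arguments expanded. DEMO_SCRATCH is a hypothetical variable name:

```shell
# DEMO_SCRATCH is a hypothetical variable name.
unset DEMO_SCRATCH

# The : builtin does nothing, but its arguments are still expanded, so
# the := assignment happens as a side effect.
: "${DEMO_SCRATCH:=/tmp}"
echo "$DEMO_SCRATCH"          # prints: /tmp

# A variable that already has a value is left alone.
DEMO_SCRATCH=/var/tmp
: "${DEMO_SCRATCH:=/tmp}"
echo "$DEMO_SCRATCH"          # prints: /var/tmp
```

Putting a few of these lines near the top of a script is one way to keep all of its defaults in a single place.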


As an aside, it might help you to remember some of these crazy symbols if
you think of the visual difference between ${VAR:=value} and
${VAR:-value}. The := will do an assignment as well as returning the value to the
right of the operator. The :- will do half of that (it returns the value but
doesn’t do the assignment), so its symbol is only half of an equals sign (i.e.,
one horizontal bar, not two). If this doesn’t help, forget that we mentioned it.


See Also


• Recipe 5.13, “Getting Default Values”
• Table 5-1


5.15 Using null as a Valid Default Value 


Problem 


You need to set a default value, but you want to allow an empty string as a 
valid value. You only want to substitute the default in the case where the 
value is unset. 


The ${:=} operator has two cases where the new value will be used: first, 
when the value of the shell variable has previously not been set (or has been 
explicitly unset); and second, where the value has been set but is empty, as in 
HOME="" or HOME=$OTHER (where $OTHER has no value).


Solution


The shell can distinguish between these two cases, and omitting the colon (:)
indicates that you want to make the substitution only if the value is unset. If 
you write only ${HOME=/tmp} without the colon, the assignment will take 
place only in the case where the variable is not set (never set or explicitly 
unset). 


Discussion 


Let’s play with the $HOME variable again, but this time without the colon in 
the operator: 


$ echo ${HOME=/tmp}    # no substitution needed
/home/uid002
$ HOME=""              # generally not wise
$ echo ${HOME=/tmp}    # will NOT substitute

$ unset HOME           # generally not wise
$ echo ${HOME=/tmp}    # will substitute
/tmp
$ echo $HOME
/tmp
$


In the case where we simply made the $HOME variable an empty string, the = 
operator didn’t do the substitution since $HOME did have a value, albeit null. 
But when we unset the variable, the substitution occurred. If you want to 
allow for empty strings, use just the = with no colon. Most times, though, the 
:= is used because you can do little with an empty value, deliberate or not.
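
If you would rather not experiment on $HOME, the same comparison can be run on a scratch variable. This sketch puts the two operators side by side for all three states; DEMO_VAR and the helper functions (prepare, with_colon, without_colon) are hypothetical names:

```shell
# prepare, with_colon, and without_colon are hypothetical helpers. Each
# of the last two takes a state ("unset", "empty", or "set") and prints
# the resulting value in brackets.
prepare() {
    unset DEMO_VAR
    case "$1" in
        empty) DEMO_VAR="" ;;
        set)   DEMO_VAR="original" ;;
    esac
}

with_colon()    { prepare "$1"; echo "[${DEMO_VAR:=default}]"; }
without_colon() { prepare "$1"; echo "[${DEMO_VAR=default}]"; }

for STATE in unset empty set; do
    echo "$STATE: $(with_colon "$STATE") $(without_colon "$STATE")"
done
```

The only line where the two columns differ is the empty one: := substitutes the default, while = keeps the empty string.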


See Also


• Recipe 5.13, “Getting Default Values”
• Recipe 5.14, “Setting Default Values”
• Table 5-1


5.16 Using More than Just a Constant String for Default


Problem 


You need something more than just a constant string as the default value for a
variable.


Solution


You can use quite a bit more on the righthand side of these shell variable
references. For example:


cd ${BASE:="$(pwd)"} 


Discussion 


As the example shows, the value that will be substituted doesn’t have to be 
just a string constant. Rather, it can be the result of a more complex shell 
expression, including running commands in a subshell (as in the example). In 
our example, if $BASE is not set, the shell will run the pwd builtin command
(to get the current directory) and use the string that it returns as the value. 


So what can you do on the righthand side of this (and the other similar) 
operators? The bash manpage says that what we put to the right of the 
operator “is subject to tilde expansion, parameter expansion, command 
substitution, and arithmetic expansion.” 


Here is what that means:


• Parameter expansion means that we could use other shell variables in this
  expression, as in ${BASE:=${HOME}}.

• Tilde expansion means that we can use an expression like ~bob, and it will
  expand that to refer to the home directory of the user bob. Use
  ${BASE:=~uid17} to set the default value to the home directory for user
  uid17, but don’t put quotes around this string, as that will defeat the tilde
  expansion.

• Command substitution is what we used in the example; it will run the
  commands and take their output as the value for the variable. Commands
  are enclosed in the single parentheses syntax, $(cmds).

• Arithmetic expansion means that we can do integer arithmetic, using the
  $(( )) syntax in this expression. Here’s an example:


  echo ${BASE:=/home/uid$((ID+1))}
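
The four kinds of expansion listed above can be tried together in a few lines. This is just a sketch: BASE1 through BASE4 are hypothetical variable names, the /work suffix is made up, and ID is assumed to hold an integer:

```shell
# All variable names here are hypothetical; ID is assumed to be an integer.
unset BASE1 BASE2 BASE3 BASE4
ID=16

: ${BASE1:=~}                       # tilde expansion (must stay unquoted)
: "${BASE2:=${HOME}/work}"          # parameter expansion
: "${BASE3:=$(pwd)}"                # command substitution
: "${BASE4:=/home/uid$((ID+1))}"    # arithmetic expansion

echo "$BASE1"
echo "$BASE2"
echo "$BASE3"
echo "$BASE4"                       # prints: /home/uid17
```

Note that the first reference is deliberately left outside of double quotes, for the reason given in the tilde expansion item.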


See Also 

• Table 5-1
• Recipe 2.15, “Connecting Two Programs by Using Output as Input”
• Recipe 5.13, “Getting Default Values”
• Recipe 6.1, “Doing Arithmetic in Your Shell Script”


5.17 Giving an Error Message for Unset Parameters


Problem 


Those shorthands for giving a default value are cool, but sometimes you need 
to force the users to give you a value; otherwise, you don’t want to proceed. 
Perhaps if they left off a parameter, they don’t really understand how to 
invoke your script. You want to leave nothing to guesswork. Is there anything 
shorter than lots of if statements to check each of your several parameters? 


Solution 


Use the ${:?} syntax when referring to the parameters, as in Example 5-6. 
bash will print an error message and then exit if a parameter is unset or null. 


Example 5-6. ch05/check_unset_parms


#!/usr/bin/env bash
# cookbook filename: check_unset_parms
#
USAGE="usage: myscript scratchdir sourcefile conversion"
FILEDIR=${1:?"Error. You must supply a scratch directory."}
FILESRC=${2:?"Error. You must supply a source file."}
CVTTYPE=${3:?"Error. ${USAGE}"}


Here’s what happens when we run that script with insufficient arguments: 


$ ./myscript /tmp /dev/null
./myscript: line 7: 3: Error. usage: myscript scratchdir sourcefile conversion
$


Discussion 


Each parameter is checked to see whether it is set and not null; if it is
unset or null, bash will print the error message and exit.
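
Because a failed check ends the shell that is running it, it can be awkward to experiment with. One way around that, sketched here, is to put the check inside a function whose body runs in a subshell, so that only the subshell exits. The name need_scratch is a hypothetical helper:

```shell
# need_scratch is a hypothetical helper. Its body is wrapped in ( ) so it
# runs in a subshell: a failed ${1:?} check ends only that subshell, not
# the shell that called the function.
need_scratch() (
    FILEDIR=${1:?"Error. You must supply a scratch directory."}
    echo "using $FILEDIR"
)

need_scratch /tmp                 # prints: using /tmp

# With no argument the check fails and the echo is never reached.
if ! need_scratch 2>/dev/null; then
    echo "check failed, but the calling shell is still running"
fi
```

An empty argument, as in need_scratch "", fails in the same way, since the colon in :? covers both the unset and the null case.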


The third variable uses another shell variable in its message. You can even 
run another command inside it: 


CVTTYPE=${3:?"Error. $USAGE. $(rm $SCRATCHFILE)"}


If parameter three is not set, then the error message will contain the phrase
“Error.” along with the value of the variable named $USAGE and then any
output from the command that removes the file named by the variable
$SCRATCHFILE. OK, so we’re getting carried away. You can make your shell
scripts awfully compact, and we do mean awfully. It is better to waste some
whitespace and a few bytes to make the logic ever so much more readable, as
in:


if [ -z "$3" ]
then
    echo "Error. $USAGE"
    rm $SCRATCHFILE
fi


One other consideration: the error message produced by the ${:?} feature
comes out with the shell script filename and line number. For example, the 
script fragment in Example 5-6 produces: 


$ ./check_unset_parms
./check_unset_parms: line 5: 1: Error. You must supply a scratch directory.
$ ./check_unset_parms somedir
./check_unset_parms: line 6: 2: Error. You must supply a source file.
$ ./check_unset_parms somedir somefile
./check_unset_parms: line 7: 3: Error. usage: myscript scratchdir sourcefile conversion


Because you have no control over this part of the message, and since it looks 
like an error in the shell script itself, combined with the issue of readability, 
this technique is not so popular in commercial-grade shell scripts. (It is handy 
for debugging, though.) 


If you’d rather have this behavior for all variables without having to change 
each one of them, use the set -u command to “treat unset variables as an 
error when substituting”: 


$ echo "$foo"           # (1)

$ set -u                # (2)
$ echo "$foo"           # (3)
bash: foo: unbound variable
$ echo $?               # exit code
1
$ set +u                # (4)
$ echo "$foo"           # (5)

$ echo $?               # exit code
0
$


(1) At first we have the normal behavior.
(2) Then we turn on nounset (-u).
(3) Now we see an error message and a failure exit code.
(4) We turn nounset (-u) off again...
(5) And return to the usual behavior.
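
Turning on nounset does not rule out variables that are allowed to be missing; a reference that supplies a default still works. Here is a sketch that runs both cases in subshells so that set -u, and the failure it causes, stay contained. DEMO_OPTIONAL is a hypothetical variable name:

```shell
# Both checks run in subshells, ( ... ), so that set -u and any resulting
# failure stay contained. DEMO_OPTIONAL is a hypothetical variable.
unset DEMO_OPTIONAL

# A plain reference to an unset variable is an error under set -u.
if ! ( set -u; echo "$DEMO_OPTIONAL" ) 2>/dev/null; then
    echo "plain reference failed"
fi

# The :- operator supplies a value, so this reference is accepted.
( set -u; echo "${DEMO_OPTIONAL:-fallback}" )    # prints: fallback
```

That combination lets a script be strict about typos in variable names while still spelling out which variables are optional.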


See Also


• Recipe 5.13, “Getting Default Values”
• Recipe 5.14, “Setting Default Values”
• Recipe 5.16, “Using More than Just a Constant String for Default”


5.18 Changing Pieces of a String 


Problem 


You want to rename a number of files. The filenames are almost right, but 
they have the wrong suffix. 


Solution 


Use a bash parameter expansion feature that will remove text that matches a 
pattern, as illustrated in Example 5-7. 


Example 5-7. ch05/suffixer 


#!/usr/bin/env bash
# cookbook filename: suffixer
#
# rename files that end in .bad to be .bash

for FN in *.bad
do
    mv "${FN}" "${FN%bad}bash"
done


Discussion


The for loop will iterate over a list of filenames in the current directory that
all end in .bad. The variable $FN will take the value of each name, one at a 
time. Inside the loop, the mv command will rename the file (move it from the 
old name to the new name). We need to put quotes around each filename in 
case the filename contains embedded spaces. 


The crux of this operation is the reference to $FN that includes an automatic 
deletion of the trailing bad characters. The ${ } delimits the reference so that 
the bash adjacent to it is just appended right onto the end of the string. 


Here it is broken down into a few more steps: 


NOBAD="${FN%bad}"
NEWNAME="${NOBAD}bash"
mv "${FN}" "${NEWNAME}"


This way you can see the individual steps of stripping off the unwanted 
suffix, creating the new name, and then renaming the files. Putting it all on 
one line isn’t so bad though, once you get used to the special operators. 


Since we are not just removing a substring from the variable but are replacing 
the bad with bash, we might have used the substitution operator for variable 
references, the slash (/). Similar to editor commands (e.g., those found in vi 
and sed) that use the slash to delimit substitutions, we could have written: 


# Not anchored, don't do this
mv "${FN}" "${FN/bad/bash}"


(Unlike with the editor commands, you don’t use a final slash—the righthand 
brace serves that function.) 


However, one reason that we didn’t do it this way is because the substitution
isn’t anchored, and can be made anywhere in the variable. If, for example, we
had a file named subaddon.bad the substitution would leave us with
subashdon.bad, which is not what we want. If we used a double slash in place
of the first slash, it would substitute every occurrence within the variable.
That would result in subashdon.bash, which isn’t what we want either. This
is better:


# Add the "." to "anchor" the pattern; this is better, but not foolproof
mv "${FN}" "${FN/.bad/.bash}"


The ${FN%bad}bash we used in our solution is already anchored: it will
only remove the text from the end of the string, which in this case is exactly
what we want.
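
The troublesome filename makes a handy test case for comparing the variations. The sketch below only echoes the results, so nothing is renamed. The last line uses a form not shown above: in bash, a pattern that begins with % in a slash substitution must match at the end of the value.

```shell
FN="subaddon.bad"

echo "${FN/bad/bash}"      # first match anywhere:  subashdon.bad
echo "${FN//bad/bash}"     # every match:           subashdon.bash
echo "${FN%bad}bash"       # suffix only:           subaddon.bash
echo "${FN/%bad/bash}"     # / anchored at the end: subaddon.bash
```

Only the last two leave the middle of the name alone.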


There are several operators that do various sorts of manipulation on the string 
values of variables when referenced. Table 5-1 summarizes them. 


Table 5-1. String manipulation operators

Inside ${ ... }        Action taken
---------------------  ------------------------------------------------
name:number:number     Return a substring of name starting at number
                       with length number
#name                  Return length of string
name#pattern           Remove (shortest) front-anchored pattern
name##pattern          Remove (longest) front-anchored pattern
name%pattern           Remove (shortest) rear-anchored pattern
name%%pattern          Remove (longest) rear-anchored pattern
name/pattern/string    Replace first occurrence
name//pattern/string   Replace all occurrences


Try them all. They are very handy. 
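
As a starting point for trying them, here is a sketch that applies each operator from Table 5-1 to one sample value. The path in NAME is made up, chosen so that every operator has something to do:

```shell
# NAME holds a made-up path, chosen so every operator has something to do.
NAME="/usr/local/bin/my.cmd.sh"

echo "${NAME:5:5}"          # local
echo "${#NAME}"             # 24
echo "${NAME#*/}"           # usr/local/bin/my.cmd.sh
echo "${NAME##*/}"          # my.cmd.sh
echo "${NAME%.*}"           # /usr/local/bin/my.cmd
echo "${NAME%%.*}"          # /usr/local/bin/my
echo "${NAME/bin/sbin}"     # /usr/local/sbin/my.cmd.sh
echo "${NAME//./_}"         # /usr/local/bin/my_cmd_sh
```

Notice that the patterns are shell patterns, as used for filename matching, and not regular expressions: the * matches any run of characters and the . is just a literal dot.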


See Also 


• man rename
• Recipe 12.5, “Comparing Two Documents”
• Recipe 13.10, “Taking It One Character at a Time” for more with substrings
• Recipe 13.15, “Trimming Whitespace”
• Recipe 5.19, “Getting the Absolute Value of a Number”


5.19 Getting the Absolute Value of a Number 


Problem 


You have a numeric value in a variable, but it may be negative, zero, or 
positive. You would like to get its magnitude—that is, its absolute value—but 
bash doesn’t seem to have an absolute value function. 


Solution 


Use string manipulation: 


${MYVAR#-}


Discussion 


This is simple string manipulation. The # searches from the front of the 
string, looking for, in this case, the minus sign (-). If found, it will remove it. 
If no minus is found, it simply results in the original value. Either way, that 
leaves the value without a leading minus, which gives us its magnitude; i.e.,
its absolute value. 
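
If you need this in more than one place, the expansion fits neatly into a small function. In this sketch, abs is a hypothetical helper name, and the argument is assumed to be an integer written with at most one leading minus sign:

```shell
# abs is a hypothetical helper name; it assumes its argument is an integer
# written with at most one leading minus sign.
abs() {
    echo "${1#-}"
}

abs -42    # prints: 42
abs 42     # prints: 42
abs 0      # prints: 0
```

Since this is string manipulation and not arithmetic, it does no checking: whatever follows the minus sign is passed through untouched.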


You could use if/then/else logic as a mathematically oriented approach:


# why bother?
if (( MYVAR < 0 ))
then
    let "MYVAR = MYVAR * -1"
fi


but as the comment says, why bother? The string manipulation technique is
short and sweet. You may want to comment it for readability, though:


MYVAR=${MYVAR#-}    # ABS(MYVAR)


See Also


• Recipe 5.18, “Changing Pieces of a String”
• Table 5-1


5.20 Using bash for basename 


Problem 


The basename command does what you want, but can you get the same result 
without calling an external executable? Is bash string manipulation able to do 
that? 


Solution 


Yes, bash can strip the directory path from a shell variable string and leave 
just the last part of the path (the filename). Where you may want to write: 


FILE=$(basename $FULLPATHTOFILE)


instead you need only write:


FILE=${FULLPATHTOFILE##*/}


Discussion 


The big difference between the first and second examples is the braces. The
first example, using parentheses, will launch a subshell to run the executable
basename with the argument that is the value of $FULLPATHTOFILE (the old
way of doing this was with backquotes, ``). The second example uses curly
braces, which is just part of the syntax for evaluating a shell variable: no
subshell, no executable file. It looks for, and removes from the front of the
string (because of the #), the longest match (because of the double ##) of the
pattern described by the asterisk and the slash (*/). The asterisk matches any
number of characters and the slash is just a literal slash. In the string
/usr/local/bin/mycmd, that pattern will match (and thus remove) the
/usr/local/bin/ part of the string, leaving mycmd as the value to be
assigned into the variable $FILE.


WARNING 


The basename command will ignore a trailing slash in the path, so $(basename 
/usr/local/bin/) returns bin whereas our bash version would return an 
empty string (since the largest pattern to end in a slash is the whole string). To 
be compatible, we should remove any trailing slash first before the other 
substitutions. 


The real basename command can also take a suffix to be removed as a second 
argument. In bash we can do that, too, but would need to do it in a separate 
step. So, a more complete replacement for: 


FILE=$(basename $MYIMAGEFILE .jpg)


would be:


FILE=${MYIMAGEFILE%/}   # remove a trailing slash
FILE=${FILE##*/}        # remove all chars up to last /
FILE=${FILE%.jpg}       # remove .jpg suffix if present
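
If you find yourself repeating those three steps, you might wrap them in a function. The following is a sketch under stated assumptions: the name bash_basename is ours, the sample paths are made up, and it only handles ordinary paths (it makes no attempt to match basename for a lone / or for repeated trailing slashes).

```shell
#!/usr/bin/env bash
# bash_basename is a hypothetical name; it mimics "basename PATH [SUFFIX]"
# for ordinary paths only
bash_basename() {
    local file="${1%/}"         # remove a trailing slash
    file="${file##*/}"          # remove all chars up to last /
    if [ -n "$2" ]
    then
        file="${file%"$2"}"     # remove the suffix, taken literally, if present
    fi
    echo "$file"
}

bash_basename /usr/local/bin/mycmd
bash_basename /usr/local/bin/
bash_basename /home/me/pics/IMG0001.jpg .jpg
```

Quoting "$2" inside the expansion keeps any pattern characters in the suffix from being treated as wildcards.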


See Also


= man basename
= Recipe 5.18, “Changing Pieces of a String”
= Recipe 5.21, “Using bash for dirname”
= Table 5-1


5.21 Using bash for dirname 


Problem 


The dirname command does what you want, but like basename, it causes a 
separate executable to be launched and run in a subshell. Can you use string 
manipulations instead? 


Solution 

Yes. Use a string manipulation operator to remove the filename—the last part 
of a path in a string—leaving as much of the directory path to that filename 
as was in the string: 


DIR=${MYPATHTOFILE%/*} 


Discussion 


If the variable holds /usr/local/bin/mycmd, we want the result of this
manipulation to give us just /usr/local/bin and drop the last part (the
filename). Since each piece of the path is separated by a slash, we just 
remove from the righthand side (because of the %) the shortest string 
(because there is only one %, not two) that matches the pattern “a slash 
followed by any number of characters” (/*). 


WARNING 


This example is for illustrating the string manipulations on shell variables. It 
does not provide a complete, compatible replacement for the dirname 
command, especially around the edge cases of any path that ends with a slash. 
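
To see where the differences show up, you can run the expansion over a few kinds of paths. This is a small sketch; the sample paths are ours and need not exist on your system, since only the strings are examined.

```shell
#!/usr/bin/env bash
# Show what ${VAR%/*} returns for a few kinds of paths.
# The sample paths are made up for illustration.
for MYPATHTOFILE in /usr/local/bin/mycmd bin/mycmd mycmd /usr/local/bin/
do
    DIR=${MYPATHTOFILE%/*}
    printf '%-22s -> %s\n' "$MYPATHTOFILE" "$DIR"
done
```

A name with no slash at all comes back unchanged (dirname would print a dot), and a path ending in a slash loses only that slash (dirname would also drop the last directory name).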


See Also


= man dirname for other options and subtle differences 
= Recipe 5.18, “Changing Pieces of a String” 

= Recipe 5.20, “Using bash for basename” 

= Table 5-1 


5.22 Using Alternate Values for Comma Separated Values


Problem 


You want to make a list of values separated by commas, but you don’t want a 
leading or trailing comma. 


Solution 


If you write LIST="${LIST},${NEWVAL}" inside a loop to build up the list,
then the first time (when LIST is null) you’ll end up with a leading comma.
You could special-case the initialization of LIST so that it gets the first
element before entering the loop, but if that’s not practical, or to avoid
duplicate code (for getting a new value), you can instead use the ${:+}
syntax in bash:


LIST="${LIST}${LIST:+,}${NEWVAL}"


If ${LIST} is null or unset, then both expressions of $LIST are replaced with
nothing. That means that the first time through the loop LIST will be assigned
NEWVAL’s value and nothing more. When LIST is not null, the second
expression (${LIST:+,}) is replaced with a comma, separating the previous
value from the new value.


Here is an example code segment for reading and constructing a CSV list:


#
# read names one at a time
# and build a comma-separated list
#
while read NEWVAL
do
    LIST="${LIST}${LIST:+,}${NEWVAL}"
done
echo $LIST
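
To try the technique without typing names at the keyboard, you can feed the loop from a here-document. This is just a sketch; the three names are placeholders for whatever input you really have.

```shell
#!/usr/bin/env bash
# Build a comma-separated list from three sample names.
# The names and the here-document stand in for real input.
LIST=""
while read -r NEWVAL
do
    LIST="${LIST}${LIST:+,}${NEWVAL}"
done <<EOF
alpha
beta
gamma
EOF
echo "$LIST"
```

Because the loop reads from a redirection rather than from a pipeline, it runs in the current shell, so LIST still holds its value after the done.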


See Also 
= Recipe 5.18, “Changing Pieces of a String”
= Table 5-1


5.23 Using Array Variables


Problem 


You’ve seen plenty of scripts so far with variables, but can bash deal with an 
array of variables? 


Solution 


Yes. bash has an array syntax for single-dimension arrays. 


Discussion 


Arrays are easy to initialize if you know the values as you write the script. 
The format is simple: 


MYRA=(first second third home) 


Each element of the array is a separate word in the list enclosed in
parentheses. Then you can refer to each element this way:


echo runners on ${MYRA[0]} and ${MYRA[2]}


This output is the result:


runners on first and third


If you write only $MYRA, you will get only the first element, just as if you had
written ${MYRA[0]}.
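
A few more array operations are worth trying while you are here. The sketch below reuses the MYRA array; the extra element value and the loop variable name are ours, for illustration only.

```shell
#!/usr/bin/env bash
MYRA=(first second third home)

echo "number of elements: ${#MYRA[@]}"
echo "last element: ${MYRA[3]}"

# "${MYRA[@]}" expands to every element, each as a separate word
for BASE in "${MYRA[@]}"
do
    echo "base: $BASE"
done

MYRA[4]=dugout      # assign to an individual element by index
echo "now ${#MYRA[@]} elements"
```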


See Also 


= Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly), 
pages 157—161, for more information about arrays 


= Recipe 7.15, “Counting String Values with bash” for another type of array 
in bash, associative arrays 


= Recipe 13.4, “Parsing Output into an Array” 


5.24 Converting Between Upper- and Lowercase 


Problem 


Your digital camera left you with a pile of files all named in uppercase, like 
IMG0001.JPG. You want the names in lowercase, but don’t want to have to 
retype each name. 


Solution 


As of bash 4.0 there are a few operators to do case conversion when
referencing a variable name. If $FN is the variable in which you put a
filename (i.e., string) that you want converted to lowercase, then ${FN,,}
will return that string in all lowercase. Similarly, ${FN^^} will return the
string in all uppercase. There is even the ${FN~~} operator to swap case,
changing all lower- to upper- and all upper- to lowercase characters (but why
would you want to do that?).


Here is a for loop that will rename all the .JPG files to lowercase names: 


for FN in *.JPG
do
    mv "$FN" "${FN,,}"
done


or as a one-liner:


for FN in *.JPG; do mv "$FN" "${FN,,}" ; done
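
Before renaming real files, it can help to see the operators at work on a plain string. This is a minimal sketch that assumes bash 4.0 or newer; the filename is a made-up sample and no files are touched.

```shell
#!/usr/bin/env bash
# Requires bash 4.0 or newer. The filename is a made-up sample.
FN="IMG0001.JPG"
LOWER="${FN,,}"         # all lowercase
UPPER="${LOWER^^}"      # all uppercase
FIRST="${LOWER^}"       # only the first character uppercased

echo "$FN -> $LOWER"
echo "$LOWER -> $UPPER"
echo "$LOWER -> $FIRST"
```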


There is another approach, also available in version 4 of bash or newer: you 
can declare your variable to be a type that is always lowercase. Any text 
assigned to it will be converted to lowercase. Using that approach our for 
loop to rename files just does a simple assignment rather than requiring a 
string substitution operator: 


declare -l lcfn   # contents will be converted to lowercase
for FN in *.JPG
do
    lcfn="$FN"
    mv "$FN" "$lcfn"
done


There are similar declarations for variables that change the case of all letters 
or only the first letter. Here’s a simple demonstration program to show how 
they work: 


declare -u UP   # all UPPERCASE
declare -l dn   # all lowercase
declare -c Ca   # only the first Uppercase


while read TXT
do
    UP="${TXT}"
    dn="${TXT}"
    Ca="${TXT}"
    echo $TXT $UP $dn $Ca
done


In the case of the variable declared with -c, only the first letter is capitalized
even if there are multiple words in the string. Try running it and see how it
works.


See Also 


= man rename 


= Recipe 5.25, “Converting to Camel Case” 


5.25 Converting to Camel Case 


Problem 


You want each word to begin with a capital letter, not just the first letter of 
the string. 


Solution 


Use a combination of an array and case conversion substitution: 


while read TXT
do
    RA=($TXT)   # must be ($ not $(
    echo ${RA[@]^}
done


Discussion 


The parentheses around $TXT cause it to be treated as array initialization.
Whitespace separating the words in the text delineates the array elements.
The [@] notation references all the elements of the array at once
(individually), and the ^ operator converts the first character (of each
element) to uppercase.
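
Here is the same idea on a fixed string, so you can run it without supplying input. It is a sketch that assumes bash 4.0 or newer; the sample text and the CAPS variable are ours. Note that the unquoted $TXT is also subject to filename expansion, so text containing characters like * may not split the way you expect.

```shell
#!/usr/bin/env bash
# Requires bash 4.0+. The sample text stands in for a line read from input.
TXT="the quick brown fox"
RA=($TXT)                       # word splitting makes one element per word
echo "${#RA[@]} words"
CAPS=$(echo "${RA[@]^}")        # capitalize the first letter of each element
echo "$CAPS"
```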


See Also 


= Recipe 5.24, “Converting Between Upper- and Lowercase”


Chapter 6. Shell Logic and Arithmetic


One of the big improvements in modern versions of bash compared with the 
original Bourne shell is in the area of arithmetic. Early versions of the shell 
had no built-in arithmetic; it had to be done by invoking a separate 
executable, even just to add 1 to a variable. In a way it’s a tribute to how 
useful and powerful the shell was (and is) that it could be used for so many 
tasks despite that awful mechanism for arithmetic. After a while, though, it 
became clear that simple, straightforward syntax was needed for the simple 
counting useful for automating repetitive tasks. The lack of such capability in 
the original Bourne shell contributed to the success of the C shell (csh) when 
it introduced C-like syntax for shell programming, including numeric 
variables. Well, that was then and this is now. If you haven’t looked at shell 
arithmetic in bash for a while, you’re in for a big surprise. 

Beyond arithmetic, there are the control structures familiar to any
programmer. There is an if/then/else construct for decision making, as
well as while loops and for loops, though you will see some bash
peculiarities to all of these. There is a case statement made quite powerful by
its string pattern matching, and an odd construct called select. After
discussing these features, we will end the chapter by using them to build two
simple command-line calculators.


6.1 Doing Arithmetic in Your Shell Script 


Problem 


You need to do some simple arithmetic in your shell script.


Solution


Use $(( )) or let for integer arithmetic expressions. For example: 


COUNT=$((COUNT + 5 + MAX * 2)) 
let COUNT+='5+MAX*2' 


Discussion 


As long as you keep to integer arithmetic, you can use all the standard (i.e., 
C-like) operators inside of $(( )) for arithmetic expressions. There is one 
additional operator, too: you can use ** for raising to a power, as in 
MAX=$((2**8)), which yields 256. 
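
A short script shows a few of these operators together. It is only a sketch; the starting values and the names REM and POW are arbitrary choices of ours.

```shell
#!/usr/bin/env bash
# Sample values are arbitrary.
COUNT=10
MAX=4
COUNT=$((COUNT + 5 + MAX * 2))  # 10 + 5 + 8; multiplication binds tighter
REM=$((COUNT % 7))              # remainder after integer division
POW=$((2**8))                   # exponentiation
echo "COUNT=$COUNT REM=$REM POW=$POW"
```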

Spaces are not needed, nor are they prohibited, around operators and
arguments (though ** must be together) within a $(( )) expression. But you
must not have spaces around the equals sign, as with any bash variable
assignment. Also, be sure to quote let expressions since the let statement is
a bash builtin and its arguments will undergo word expansion.


WARNING 


Do not put spaces around the equals sign of an assignment! If you write:


COUNT = $((COUNT+5)) # not what you think!


then bash will try to run a program named COUNT with its first argument an
equals sign and its second argument the number you get by adding 5 to the
value of $COUNT. Remember: no spaces around the assignment equals sign!


Another oddity of these expressions is that the $ that we normally put in front
of a shell variable to say we want its value (as in $COUNT or $MAX) is not
needed inside the double parentheses. For example, we can write:


$((COUNT + 5 + MAX * 2))


without including the dollar sign on the shell variables; in effect, the outer $
applies to the entire expression. We do need the dollar sign, though, if we are
using a positional parameter (e.g., $2), to distinguish it from a numeric
constant (e.g., 2). Here’s an example:


COUNT=$((COUNT + $2 + OFFSET)) 


There is a similar mechanism for integer arithmetic with shell variables using
the bash builtin let statement. It uses the same arithmetic operators as the
$(( )) construct:


let COUNT=COUNT+5 


When using let there are some fancy assignment operators we can use, such 
as these (which will accomplish the same thing as the previous line): 


let COUNT+=5 


(These should look familiar to programmers of C/C++ and Java.) This
example adds five to the previous value of COUNT without our having to
repeat the variable name.


Table 6-1 shows a list of those special assignment operators. 


Table 6-1. Explanation of assignment operators in bash

Operator  Operation with assignment  Use     Meaning
=         Simple assignment          a=b     a=b
*=        Multiplication             a*=b    a=(a*b)
/=        Division                   a/=b    a=(a/b)
%=        Remainder                  a%=b    a=(a%b)
+=        Addition                   a+=b    a=(a+b)
-=        Subtraction                a-=b    a=(a-b)
<<=       Bit-shift left             a<<=b   a=(a<<b)
>>=       Bit-shift right            a>>=b   a=(a>>b)
&=        Bitwise “and”              a&=b    a=(a&b)
^=        Bitwise “exclusive or”     a^=b    a=(a^b)
|=        Bitwise “or”               a|=b    a=(a|b)


These assignment operators are also available with $(( )) provided they 
occur inside the double parentheses. The outermost assignment is still just 
plain old shell variable assignment. 


The assignments can also be cascaded, through the use of the comma 
operator: 


echo $(( X+=5 , Y*=3 )) 


which will do both assignments and then echo the result of the second 
expression (since the comma operator returns the value of its second 
operand). If you don’t want to echo the result, the more common usage would 
be with the let statement: 


let X+=5 Y*=3 


The comma operator is not needed here, as each word of a let statement is 
its own arithmetic expression. 
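
The two forms can be compared side by side. This is a small sketch with arbitrary starting values; the RESULT variable is ours. The let arguments are quoted here so that the * is never exposed to filename expansion.

```shell
#!/usr/bin/env bash
X=1
Y=4
RESULT=$(( X+=5 , Y*=3 ))       # value of a comma expression is its last operand
echo "X=$X Y=$Y RESULT=$RESULT"

let "X+=5" "Y*=3"               # each word is its own arithmetic expression
echo "X=$X Y=$Y"
```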

One other important difference between the let statement and the $(( ))
syntax is how they handle whitespace (i.e., the space character). The let
statement either requires quotes or that there be no spaces around not only the
assignment operator (the equals sign), but any of the other operators as well;
it must all be packed together into a single word. These both work:


let i=2+2
let "i = 2 + 2"


The $(( )) syntax, however, can be much more generous, allowing all sorts
of whitespace within the parentheses. For that reason, it is less prone to errors
and makes the code much more readable, and is therefore our preferred way
of doing bash integer arithmetic. However, an exception can be made for the
occasional += assignment or ++ operator, or when we get nostalgic for the
early days of BASIC programming (which had a LET statement).


WARNING


Remember that this is integer arithmetic, not floating point. Don’t expect much
out of an expression like 2/3, which in integer arithmetic evaluates to 0 (zero).
The division is integer division, which will truncate any fractional result.


See Also 


= help let
= The bash manpage 


6.2 Branching on Conditions 


Problem 


You want to check if you have the right number of arguments and take 
actions accordingly. You need a branching construct. 


Solution 


The if statement in bash is similar in appearance to that in other 
programming languages: 


if [ $# -lt 3 ]
then
    printf "%b" "Error. Not enough arguments.\n"
    printf "%b" "usage: myscript file1 op file2\n"
    exit 1
fi


or alternatively: 


if (( $# < 3 ))
then
    printf "%b" "Error. Not enough arguments.\n"
    printf "%b" "usage: myscript file1 op file2\n"
    exit 1
fi


Here’s a full-blown if with an elif (bash-talk for else-if) and an else 
clause: 


if (( $# < 3 ))
then
    printf "%b" "Error. Not enough arguments.\n"
    printf "%b" "usage: myscript file1 op file2\n"
    exit 1
elif (( $# > 3 ))
then
    printf "%b" "Error. Too many arguments.\n"
    printf "%b" "usage: myscript file1 op file2\n"
    exit 2
else
    printf "%b" "Argument count correct. Proceeding...\n"
fi


You can even do things like this: 


[ $result = 1 ] \
  && { echo "Result is 1; excellent." ; exit 0;   } \
  || { echo "Uh-oh, ummm, RUN AWAY! " ; exit 120; }


(For a discussion of this last example, see Recipe 2.14.)


Discussion


We have two things we need to discuss: the basic structure of the if 
statement and how it is that we have different syntax (parentheses or 
brackets, operators or options) for the if expression. The first may help 
explain the second. The general form for an if statement, from the manpage
for bash, is:


if list; then list; [ elif list; then list; ] ... [ else list; ] fi 


The [ and ] are used to delineate optional parts of the statement (e.g., some 
if statements have no else clause). So let’s look for a moment at the if 
without any optional elements. 


The simplest form for an if statement would be: 


if list; then list; fi 


TIP 


In bash, the semicolon serves the same purpose as a newline—it ends a 
statement. We could have crammed the examples in the Solution section onto 
fewer lines by using semicolons, but it is more readable to use newlines. 


The then list seems to make sense: it’s the statement or statements that
will execute provided that the if condition is true (or so we would surmise
from other programming languages). But what’s with the if list? Wouldn’t
you expect it to be if expression?


You might, except that this is a shell, a command processor. Its primary
operation is to execute commands. So, the list after the if is a place where
you can put a list of commands. What, you ask, will be used to determine the
branching (the alternate paths of the then or the else)? It will be determined
by the return value of the last command in the list. (The return value, you
might remember, is also available as the value of the $? variable.)


Let’s take a somewhat strange example to make this point:


$ cat trythis.sh
if ls; pwd; cd $1;
then
    echo success
else
    echo failed
fi
pwd


$ bash ./trythis.sh /tmp 
$ bash ./trythis.sh /nonexistent 


$ 


In this strange script, the shell will execute three commands (an ls, a pwd, and
a cd) before doing any branching. The argument to the cd in this example is
the first argument supplied on the shell script invocation. If there is no
argument supplied, it will just execute cd, which returns you to your home
directory.

So what happens? Try it yourself and find out. The result showing “success” 
or “failed” will depend on whether or not the cd command succeeds. In our 
example, the cd is the last command in the if list of commands. If the cd 
fails, the else clause is taken, but if it succeeds, the then clause is taken. 


Properly written commands and builtins return a value of 0 (zero) when they 
encounter no errors in their execution. If they detect a problem (e.g., bad 
parameters, I/O errors, file not found), they will return some nonzero value 
(often a different value for each different kind of error they detect). 


This is why it is important for both shell script writers and C (and other 
language) programmers to be sure to return sensible values upon exiting from 
their scripts and programs. Someone’s if statement may be depending on it! 
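
To make that concrete, here is a sketch of a function that returns a different value for each kind of problem, along with an if that depends on it. The function name check_arg, the return codes, and the sample paths are our own choices for illustration.

```shell
#!/usr/bin/env bash
# check_arg is a hypothetical function used only to illustrate return values
check_arg() {
    [ -n "$1" ] || return 2     # missing argument
    [ -d "$1" ] || return 1     # not a directory
    return 0                    # success
}

if check_arg /
then
    echo "/ looks fine"
else
    echo "check failed with status $?"
fi

check_arg /no/such/dir
echo "status for /no/such/dir: $?"
```

The if only cares whether the value is zero or not, but a caller that inspects $? can tell the two failures apart.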


OK, so how do we get from this strange if construct to something that looks
like a real if statement, the kind that you are used to seeing in programs?
What’s going on with the examples that began this recipe? After all, they
don’t look like lists of statements.


Let’s try this on for size: 


if test $# -lt 3
then
    echo try again.
fi


Do you see something that looks like, if not an entire list, then at least a 
single shell command—the builtin command test, which will take its 
arguments and compare their values? The test command will return a 0 if true 
or a 1 otherwise. To see this yourself, try the test command on a line by itself, 
and then echo $? to see its return value. 


The first example we gave that began if [ $# -lt 3 ] looks a lot like the
test statement. That’s because the [ is actually just a different name for the
same command. (When invoked with the name [ it also requires a trailing ]
as the last parameter, for readability and aesthetic reasons.) So that explains
the first syntax: the expression in the if statement is actually a list of only
one command, a test command.
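
You can watch the two names behave identically by checking $? after each one. In this sketch, set -- simulates two command-line arguments, and the variables R1 through R3 are ours, there only to hold the results.

```shell
#!/usr/bin/env bash
set -- one two          # simulate two command-line arguments

test $# -lt 3
R1=$?
echo "test \$# -lt 3 returned $R1"

[ $# -lt 3 ]
R2=$?
echo "[ \$# -lt 3 ] returned $R2"

test $# -gt 3
R3=$?
echo "test \$# -gt 3 returned $R3"
```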


TIP 


In the early days of Unix, test was its own separate executable and [ was just a 
link to the same executable. They still exist as executables, but bash 
implements them as a builtin command. 


Now what about the if (( $# < 3 )) expression in our list of examples in
the Solution section? The double parentheses are one of several types of
compound commands. This kind is useful for if statements because it
performs an arithmetic evaluation of the expression between the double
parentheses. This is a more recent bash improvement, added for just such an
occasion as its use in if statements.


The important distinctions to make with the two kinds of syntax that can be
used with the if statement are the ways to express the tests, and the kinds of
things for which they test. The double parentheses are strictly for arithmetic
expressions. The square brackets can also test for file characteristics, but the
syntax is much less streamlined for arithmetic expressions. This is
particularly true if you need to group larger expressions with parentheses
(which need to be quoted or escaped when using square brackets).
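
The difference is easiest to see with the same grouped condition written both ways. This is a sketch only; the variable names and values are ours, picked for illustration.

```shell
#!/usr/bin/env bash
# Sample values; the variable names are ours, for illustration only.
A=7 LOW=1 HIGH=10 FORCE=0

# arithmetic syntax: plain parentheses group subexpressions
if (( (A >= LOW && A <= HIGH) || FORCE ))
then
    ARITH="in range"
else
    ARITH="out of range"
fi

# square brackets: each grouping parenthesis must be escaped
if [ \( "$A" -ge "$LOW" -a "$A" -le "$HIGH" \) -o "$FORCE" -ne 0 ]
then
    BRACKET="in range"
else
    BRACKET="out of range"
fi
echo "arithmetic: $ARITH; brackets: $BRACKET"
```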


See Also 

= help if
= help test
= man test

= Recipe 2.14, “Saving or Grouping Output from Several Commands” 
= Recipe 4.4, “Telling Whether a Command Succeeded or Not” 

= Recipe 6.3, “Testing for File Characteristics” 

= Recipe 6.5, “Testing for String Characteristics” 


= Recipe 15.11, “Getting Input from Another Machine” 


6.3 Testing for File Characteristics


Problem 


You want to make your script robust by checking to see if your input file is 
there before reading from it; you would also like to see if your output file has 
write permissions before writing to it and you would like to see if there is a 
directory there before you attempt to cd into it. How do you do all that in 
bash scripts? 


Solution


Use the various file characteristic tests in the test command as part of your if
statements. Your specific problems might be solved with scripting that looks
something like Example 6-1.


Example 6-1. ch06/checkfile 


#!/usr/bin/env bash
# cookbook filename: checkfile
#
DIRPLACE=/tmp
INFILE=/home/yucca/amazing.data
OUTFILE=/home/yucca/more.results

if [ -d "$DIRPLACE" ]
then
    cd "$DIRPLACE"
    if [ -e "$INFILE" ]
    then
        if [ -w "$OUTFILE" ]
        then
            doscience < "$INFILE" >> "$OUTFILE"
        else
            echo "cannot write to $OUTFILE"
        fi
    else
        echo "cannot read from $INFILE"
    fi
else
    echo "cannot cd into $DIRPLACE"
fi


Discussion


We put all the references to the various filenames in quotes in case they have 
any embedded spaces in the pathnames. There are none in this example, but if 
you change the script you might use other pathnames. 


We tested and executed the cd before we tested the other two conditions. In
this example it wouldn’t matter, but if $INFILE or $OUTFILE were relative
pathnames (not beginning from the root of the filesystem, i.e., with a leading
/), then the test might evaluate to true before the cd and not after, or vice
versa. This way, we test right before we use the files.


We use the double-greater-than operator (>>) to concatenate output onto our 
results file, rather than replacing the old content with the new content. 


The several tests could be combined into one large if statement using the -a 
(read “and”) operator, but then if a test failed you couldn’t give a very helpful 
error message since you wouldn’t know which test didn’t pass. 


There are several other characteristics for which you can test. Three of them 
are tested using binary operators, each taking two filenames: 


FILE1 -nt FILE2
    Is newer than (it checks the modification date). An existing file is
    considered “newer” than one that doesn’t exist.

FILE1 -ot FILE2
    Is older than; also, a file that doesn’t exist is considered older than one
    that does.

FILE1 -ef FILE2
    Have the same device and inode numbers (identical files, even if pointed
    to by different links).
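
The three binary operators can be exercised safely in a scratch directory. The sketch below assumes mktemp -d and hard links are available on your system; the filenames and the result variables (NEWER, OLDER, SAME, VSMISSING) are ours.

```shell
#!/usr/bin/env bash
WORKDIR=$(mktemp -d) || exit 1
OLD="$WORKDIR/old.txt"
NEW="$WORKDIR/new.txt"
LINK="$WORKDIR/link.txt"

touch -t 202001010000 "$OLD"    # set an explicit, older modification time
touch "$NEW"                    # modification time is now
ln "$NEW" "$LINK"               # hard link: same device and inode

NEWER=no; OLDER=no; SAME=no; VSMISSING=no
[ "$NEW" -nt "$OLD" ]  && NEWER=yes
[ "$OLD" -ot "$NEW" ]  && OLDER=yes
[ "$NEW" -ef "$LINK" ] && SAME=yes
[ "$NEW" -nt "$WORKDIR/missing" ] && VSMISSING=yes

echo "newer=$NEWER older=$OLDER same=$SAME newer-than-missing=$VSMISSING"
rm -rf "$WORKDIR"               # clean up the scratch directory
```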


Table 6-2 shows the other tests related to files (see “Test Operators” in
Appendix A for a more complete list). They all are unary operators, taking
the form option filename, as in if [ -e myfile ].


Table 6-2. Unary operators that check file characteristics

Option  Description
-b      File is a block special device (for files like /dev/hda1)
-c      File is character special (for files like /dev/tty)
-d      File is a directory
-e      File exists
-f      File is a regular file
-g      File has its set-group-ID (setgid) bit set
-h      File is a symbolic link (same as -L)
-G      File is owned by the effective group ID
-k      File has its sticky bit set
-L      File is a symbolic link (same as -h)
-N      File has been modified since it was last read
-O      File is owned by the effective user ID
-p      File is a named pipe
-r      File is readable
-s      File has a size greater than zero
-S      File is a socket
-u      File has its set-user-ID (setuid) bit set
-w      File is writable
-x      File is executable

See Also 


= Recipe 2.10, “Appending Rather than Clobbering Output” 
= Recipe 4.6, “Using Fewer if Statements” 


= “Test Operators” in Appendix A 


6.4 Testing for More than One Thing


Problem


What if you want to test for more than one characteristic? Do you have to 
nest your if statements? 


Solution 


Use the operators for logical AND (-a) and OR (-o) to combine more than
one test in an expression. For example:


if [ -r "$FILE" -a -w "$FILE" ]


will test to see that the file is both readable and writable. 


Discussion 


All the file test conditions include an implicit test for existence, so you don’t 
need to test if a file exists and is readable. It won’t be readable if it doesn’t 
exist. 


These conjunctions (-a for AND and -o for OR) can be used for all the 
various test conditions. They aren’t limited to just the file conditions. 


You can make several AND/OR conjunctions in one statement. You might
need to use parentheses to get the proper precedence, as in a and (b or c),
but if you use parentheses, be sure to escape their special meaning from the
shell by putting a backslash before each or by quoting each parenthesis.
Don’t try to quote the entire expression in one set of quotes, however, as that
will make your entire expression a single term that will be treated as a test for
an empty string (see Recipe 6.5).


Here’s an example of a more complex test with the parentheses properly 
escaped: 


if [ -r "$FN" -a \( -f "$FN" -o -p "$FN" \) ]


Don’t make the assumption that these expressions are evaluated in quite the
same order as in Java or C. In C and Java, if the first part of the AND
expression is false (or the first part true in an OR expression), the second part
of the expression won’t be evaluated (we say the expression short-circuits).
However, because the shell makes multiple passes over the statement while
preparing it for evaluation (e.g., doing parameter substitution, etc.), both parts
of the joined condition may have been partially evaluated. While it doesn’t
matter in this simple example, in more complicated situations it might. For
example:


if [ -z "$V1" -o -z "${V2:=YIKES}" ] 


Even if $V1 is empty, satisfying enough of the if statement that the second 
part of the condition (checking if $V2 is empty) need not occur, the value of 
$V2 may have already been modified (as a side effect of the parameter 
substitution for $V2). The parameter substitution step occurs before the -z 
tests are made. Confused? Don’t be...just don’t count on short circuits in
your conditionals. If you need that kind of behavior, just break the if
statement into two nested if statements or use && and ||.
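
You can see the side effect, and the way around it, with a short experiment. This is a sketch; V3 is a variable we introduce so the second test starts from a clean slate.

```shell
#!/usr/bin/env bash
unset V1 V2 V3

# one test command: every argument is expanded before -z is evaluated
if [ -z "$V1" -o -z "${V2:=YIKES}" ]
then
    echo "single test: V2 is now '$V2'"
fi

# two test commands joined by ||: the second is never run, so never expanded
if [ -z "$V1" ] || [ -z "${V3:=YIKES}" ]
then
    echo "two tests: V3 is now '${V3}'"
fi
```

In the first if, V2 ends up set even though the first condition alone was enough. In the second, V3 is left untouched because the shell skips the second command entirely.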


See Also 


= Recipe 6.5, “Testing for String Characteristics” 


= Appendix C for more on command-line processing


6.5 Testing for String Characteristics 


Problem 


You want your script to check the values of some strings before using them. 
The strings could be user input, read from a file, or environment variables 
passed to your script. How do you do that with bash scripts? 


Solution


There are some simple tests that you can do with the builtin test command,
using the single-bracket if statements. You can check to see whether a
variable has any text, and you can check to see whether two variables are
equal as strings.


Discussion 
Take a look at Example 6-2. 


Example 6-2. ch06/checkstr 


#!/usr/bin/env bash
# cookbook filename: checkstr
#
# if statement
# test a string to see if it has any length
#
# use the command-line argument
VAR="$1"
#
# if [ "$VAR" ] will usually work but is bad form, using -n is more clear
if [ -n "$VAR" ]
then
    echo has text
else
    echo zero length
fi
#
if [ -z "$VAR" ]
then
    echo zero length
else
    echo has text
fi


We use the phrase “has any length” deliberately. There are two types of
variables that will have no length: those that have been set to an empty
string and those that have not been set at all. This test does not distinguish
between those two cases. All it asks is whether there are some characters in
the variable.


It is important to put quotes around the "$VAR" expression because without
them your syntax could be disturbed by odd user input. If the value of $VAR
were x -a 7 -lt 5 and if there were no quotes around the $VAR, then the
expression:


if [ -z $VAR ]


would become (after variable substitution):


if [ -z x -a 7 -lt 5 ]


which is legitimate syntax for a more elaborate test, but one that will yield a
result that is not what you wanted (i.e., one not based on whether the string
has characters).
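
A variation on that input makes the danger more visible, because it flips the answer. In this sketch the value of VAR and the result variables UNQUOTED and QUOTED are ours, chosen so that the unquoted test reports the wrong thing.

```shell
#!/usr/bin/env bash
VAR='x -o 5 -lt 7'      # odd input, chosen for illustration

if [ -z $VAR ]          # unquoted: becomes [ -z x -o 5 -lt 7 ]
then
    UNQUOTED="zero length"
else
    UNQUOTED="has text"
fi

if [ -z "$VAR" ]        # quoted: one string is tested, as intended
then
    QUOTED="zero length"
else
    QUOTED="has text"
fi
echo "unquoted: $UNQUOTED; quoted: $QUOTED"
```

Without the quotes, the test is satisfied by the 5 -lt 7 part and claims the variable is empty, even though it plainly is not.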


See Also 
= Recipe 6.7, “Testing with Pattern Matches” 


= Recipe 6.8, “Testing with Regular Expressions” 
= Recipe 14.2, “Avoiding Interpreter Spoofing” 
= “Test Operators” in Appendix A


6.6 Testing for Equality 


Problem 


You want to check to see if two shell variables are equal, but there are two 
different test operators: -eq and = (or ==). Which one should you use? 


Solution 


The type of comparison you need determines which operator you should use.
Use the -eq operator for numeric comparisons and the equality primary = (or
==) for string comparisons.


Discussion 
Example 6-3 is a simple script to illustrate the situation.


Example 6-3. ch06/strvsnum 


#!/usr/bin/env bash
# cookbook filename: strvsnum
#
# the old string vs. numeric comparison dilemma
#
VAR1=" 05 "
VAR2="5"

printf "%s" "do they -eq as equal? "
if [ "$VAR1" -eq "$VAR2" ]
then
    echo YES
else
    echo NO
fi

printf "%s" "do they = as equal? "
if [ "$VAR1" = "$VAR2" ]
then
    echo YES
else
    echo NO
fi


When we run the script, here is what we get: 


$ bash strvsnum
do they -eq as equal? YES
do they = as equal? NO
$


While the numeric value is the same (5) for both variables, characters such as 
leading zeros and whitespace can mean that the strings are not equal as 
strings. 


Both = and == are accepted, but the single equals sign follows the POSIX 
standard and is more portable. 


It may help you to remember which comparison to use if you recognize that 
the -eq operator is similar to the FORTRAN .eq. operator. (FORTRAN is a 
very numbers-oriented language, used for scientific computation.) In fact, 
there are several numerical comparison operators in bash, each similar to an 
old FORTRAN operator. The abbreviations, all listed in Table 6-3, are rather 
mnemonic and easy to figure out. 


Another way to remember which to use is that it feels “backward” or 
“opposite”: the string-like comparators (the syntax using characters; e.g., 
-eq) are for numbers and the numeric-looking comparators (e.g., the math-like 
<=) are for strings. 


Table 6-3. bash’s comparison operators 

Numeric   String   Meaning 
-lt       <        Less than 
-le       <=       Less than or equal to 
-gt       >        Greater than 
-ge       >=       Greater than or equal to 
-eq       =, ==    Equal to 
-ne       !=       Not equal to 


This is the opposite of Perl, in which eq, ne, etc. are the string operators, 
while ==, !=, etc. are numeric. 


Maybe the best solution is to always do your numerical tests with the 
double-parentheses syntax and your string comparisons with the 
double-square-brackets syntax. Then you can always use the math-style symbols for 
comparison. 
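

As a rough sketch of that advice (the helper names same_number and same_string are ours, chosen only for illustration), the same two values can be compared both ways using the math-style == in each case:

```bash
#!/usr/bin/env bash
# Hypothetical helpers; a sketch of the double-parentheses and
# double-bracket forms, not a replacement for the strvsnum script.

same_number() {
    # (( )) compares integers, so surrounding whitespace does not matter.
    # Note: a leading zero makes (( )) read a number as octal.
    if (( $1 == $2 )); then echo YES; else echo NO; fi
}

same_string() {
    # [[ ]] compares strings; quoting the right side keeps it literal.
    if [[ $1 == "$2" ]]; then echo YES; else echo NO; fi
}

same_number " 5 " "5"    # prints YES
same_string " 5 " "5"    # prints NO
```

The operator looks the same in both functions; it is the enclosing syntax that decides whether the comparison is numeric or textual.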


See Also 

• Recipe 6.7, “Testing with Pattern Matches” 
• Recipe 6.8, “Testing with Regular Expressions” 
• Recipe 14.12, “Validating Input” 
• “Test Operators” in Appendix A 


6.7 Testing with Pattern Matches 


Problem 


You want to test a string not for a literal match, but to see if it fits a pattern. 
For example, you want to know if a file is named like a JPEG file might be 
named. 


Solution 


Use the double-bracket compound statement in an if statement to enable 
shell-style pattern matches on the righthand side of the equality operator: 


if [[ "${MYFILENAME}" == *.jpg ]] 


Discussion 


The double-bracket syntax is not the old-fashioned [ of the test command, 
but a newer bash mechanism (available since v2.01 or so). It uses the same 
operators that work with the single-bracket form, but in the double-bracket 
syntax the equals sign is a more powerful string comparator. You can use a 
single or a double equals sign, as we have used here; they are the same 
semantically. We prefer to use the double equals sign (especially when doing 
pattern matching) to emphasize the difference, but it is not the reason that we 
get pattern matching—that comes from the double-bracket compound 
statement. 


The standard pattern matching includes the * to match any number of 
characters, the question mark (?) to match a single character, and brackets 
([]) for including a list of possible characters. Note that these resemble shell 
file wildcards, and are not regular expressions. 


Don’t put quotes around the pattern if you want it to behave as a pattern. If 
our string had been quoted, it would have only matched strings with a literal 
asterisk as the first character. 
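

To see the difference the quotes make, here is a small sketch; the function names are hypothetical and exist only to wrap the two forms of the test:

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

ends_in_jpg() {
    # Unquoted right side: *.jpg is treated as a pattern.
    [[ "$1" == *.jpg ]]
}

is_literal_star_jpg() {
    # Quoted right side: "*.jpg" is treated as a plain string.
    [[ "$1" == "*.jpg" ]]
}

ends_in_jpg vacation.jpg         && echo "pattern matched vacation.jpg"
is_literal_star_jpg vacation.jpg || echo "quoted form did not match vacation.jpg"
is_literal_star_jpg '*.jpg'      && echo "quoted form matched the literal name *.jpg"
```

All three messages are printed, since the quoted form matches only a name that is literally *.jpg.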


There are more powerful pattern-matching capabilities available by turning 
on some additional options in bash. Let’s expand our example to look for 
filenames that end in either .jpg or .jpeg. We can do that with this bit of code: 


shopt -s extglob 
if [[ "$FN" == *.@(jpg|jpeg) ]] 
then 
    # and so on 


The shopt -s command is the way to turn on shell options. The extglob 
option deals with extended pattern matching (or globbing). With this 
extended pattern matching we can have several patterns, separated by the | 
character and grouped by parentheses. The first character preceding the 
parentheses says whether the list should match just one occurrence of a 
pattern in the list (using a leading @) or some other criteria. Table 6-4 lists the 
possibilities. 


Table 6-4. Grouping symbols for extended pattern matching 

Grouping   Meaning 
@(...)     Only one occurrence 
*(...)     Zero or more occurrences 
+(...)     One or more occurrences 
?(...)     Zero or one occurrence 
!(...)     Not this, but anything else 


Matches are case-sensitive, but you may use shopt -s nocasematch (in 
bash versions 3.1+) to change that. This option affects case and [[ 
commands. 
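

Here is a short sketch that combines the two options. The is_jpeg function is a hypothetical helper of our own, and it uses the ?(...) grouping rather than the @(...) form shown earlier:

```bash
#!/usr/bin/env bash
# A sketch only; is_jpeg is a hypothetical helper name.
shopt -s extglob          # turn on extended pattern matching

is_jpeg() {
    # ?(e) means zero or one "e", so this accepts both .jpg and .jpeg
    [[ "$1" == *.jp?(e)g ]]
}

is_jpeg photo.jpeg && echo "photo.jpeg matches"
is_jpeg photo.JPG  || echo "photo.JPG does not match (case-sensitive)"

shopt -s nocasematch      # bash 3.1+: ignore case in [[ and case
is_jpeg photo.JPG  && echo "photo.JPG matches with nocasematch"
shopt -u nocasematch      # turn it back off so later tests are unaffected
```

Turning nocasematch off again afterward is a design choice; the option stays in effect for the rest of the script unless you unset it.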


See Also 


• Recipe 14.2, “Avoiding Interpreter Spoofing” 
• Recipe 16.9, “Adjusting Shell Behavior and Environment” 
• “shopt Options” in Appendix A 
• “Pattern-Matching Characters” in Appendix A 
• “extglob Extended Pattern-Matching Operators” in Appendix A 


6.8 Testing with Regular Expressions 


Problem 


Sometimes even the extended pattern matching of the extglob option isn’t 
enough. What you really need are regular expressions. Let’s say that you rip a 
CD of classical music into a directory, ls that directory, and see these names: 


$ ls 
Ludwig Van Beethoven - 01 - Allegro.ogg 
Ludwig Van Beethoven - 02 - Adagio un poco mosso.ogg 
Ludwig Van Beethoven - 03 - Rondo - Allegro.ogg 
Ludwig Van Beethoven - 04 - "Coriolan" Overture, Op. 62.ogg 
Ludwig Van Beethoven - 05 - "Leonore" Overture, No. 2 Op. 72.ogg 
$ 


You’d like to write a script to rename these files to something simple, such as 
just the track number. How can you do that? 


Solution 


Use the regular expression matching of the =~ operator. Once it has matched 
the string, the various parts of the pattern are available in the shell variable 
$BASH_REMATCH. Example 6-4 is the part of the script that deals with the 
pattern match. 


Example 6-4. ch06/trackmatch 


#!/usr/bin/env bash 
# cookbook filename: trackmatch 
# 
for CDTRACK in * 
do 
    if [[ "$CDTRACK" =~ "([[:alpha:][:blank:]]*)- ([[:digit:]]*) - (.*)$" ]] 
    then 
        echo Track ${BASH_REMATCH[2]} is ${BASH_REMATCH[3]} 
        mv "$CDTRACK" "Track${BASH_REMATCH[2]}" 
    fi 
done 


WARNING 


This requires bash version 3.0 or newer—older versions don’t have the =~ 
operator. In addition, bash version 3.2 unified the handling of the pattern in the 
== and =~ conditional command operators but introduced a subtle quoting bug 
that was corrected in 3.2 patch #3. If the solution shown here fails, you may be 
using bash version 3.2 without that patch. You might want to upgrade to a 
newer version. You might also avoid the bug with a less readable version of the 
regular expression by removing the quotes around the regex and escaping each 
parenthesis and space character individually, which gets ugly quickly: 


if [[ "$CDTRACK" =~ \([[:alpha:][:blank:]]*\)-\ \ 
> \([[:digit:]]*\)\ -\ \(.*\)$ ]] 


Discussion 


If you are familiar with regular expressions from sed, awk, and older shells, 
you may notice a few slight differences with this newer form. Most 
noticeable are the character classes such as [:alpha:] and that the grouping 
parentheses don’t need to be escaped; we don’t write \( here, as we would 
in sed. Here, \( would mean a literal parenthesis. 


The subexpressions, each enclosed in parentheses, are used to populate the 
bash builtin array variable $BASH_REMATCH. The zeroth element 
(${BASH_REMATCH[0]}) is the entire string matched by the regular 
expression. Any subexpressions are available as ${BASH_REMATCH[1]}, 
${BASH_REMATCH[2]}, and so on. Any time a regular expression is used this 
way, it will populate the variable $BASH_REMATCH. Since other bash functions 
may want to use regular expression matching, you may want to assign this 
variable to one of your own naming as soon as possible, so as to preserve the 
values for your later use. In our example we use the values right away, inside 
our if/then clause, so we don’t bother to save them for use elsewhere. 


Regular expressions have often been described as write-only expressions 
because they can be very difficult to decipher. We’ll build this one up in 
several steps to show how we arrived at the final expression. The general 
layout of the filenames given to our datafiles, as in this example, seems to be 
like this: 


Ludwig Van Beethoven - 04 - "Coriolan" Overture, Op. 62.ogg 


That is, a composer’s name, a track number, and then the title of the piece, 
ending in .ogg (these were saved in Ogg Vorbis format, for smaller space and 
higher fidelity). 


At the lefthand side of the expression is an opening (or left) parenthesis. That 
begins our first subexpression. Inside it, we will write an expression to match 
the first part of the filename, the composer’s name—marked in bold here: 


([[:alpha:][:blank:]]*)- ([[:digit:]]*) - (.*)$ 


The composer’s name consists of any number of alphabetic characters and 
blanks. We use the square brackets to group the set of characters that will 
make up the name. Rather than write [a-zA-Z], we use the character class 
names [:alpha:] and [:blank:] and put them inside the square brackets. 
This is followed by an asterisk to indicate zero or more repetitions. The right 
parenthesis closes off the first sub-expression, followed by a literal hyphen 
and a blank. 


The second subexpression (marked in bold here) will attempt to match the 
track number: 


([[:alpha:][:blank:]]*)- ([[:digit:]]*) - (.*)$ 


The second subexpression begins with another left parenthesis. The track 
numbers are integers, composed of digits (the character class [:digit:]), 
which we write inside another pair of brackets followed by an asterisk as 
[[:digit:]]* to indicate zero or more of what is in the brackets (i.e., 
digits). Then our pattern has the literals blank, hyphen, and blank. 


The final subexpression will catch everything else, including the track name 
and the file extension: 


([[:alpha:][:blank:]]*)- ([[:digit:]]*) - (.*)$ 


This is the common and familiar .* regular expression, which means any 
number (*) of any character (.), again enclosed in parentheses. We end the 
expression with a dollar sign, which matches the end of the string. Matches 
are case-sensitive, but you may use shopt -s nocasematch (available in 
bash versions 3.1+) to change that. This option affects case and [[ 
commands. 
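

One practical note before moving on: in bash 3.2 and later, quoting the righthand side of =~ makes bash treat it as a literal string rather than a regular expression. A commonly used way around that is to keep the expression in a variable and expand it unquoted. The sketch below assumes that approach; parse_track and TRACK_RE are hypothetical names of our own, and the function also copies $BASH_REMATCH into its own array, as suggested earlier:

```bash
#!/usr/bin/env bash
# A sketch under the assumptions above; not the trackmatch script itself.
TRACK_RE='([[:alpha:][:blank:]]*)- ([[:digit:]]*) - (.*)$'

parse_track() {
    local match
    # $TRACK_RE is deliberately unquoted so it is used as a regex.
    if [[ "$1" =~ $TRACK_RE ]]
    then
        # Copy the array right away; the next =~ would overwrite it.
        match=( "${BASH_REMATCH[@]}" )
        echo "Track ${match[2]} is ${match[3]}"
    else
        return 1
    fi
}

parse_track 'Ludwig Van Beethoven - 01 - Allegro.ogg'
# prints: Track 01 is Allegro.ogg
```

Because the function only echoes what it found, it is safe to try on a directory listing before adding any mv command.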


See Also 


• man regex (Linux, Solaris, HP-UX) or man re_format (BSD, Mac) for 
  the details of your regular expression library 
• Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl 
  (O’Reilly) 
• Recipe 7.7, “Searching with More Complex Patterns” 
• Recipe 7.8, “Searching for an SSN” 
• Recipe 19.15, “Confusing Shell Wildcards and Regular Expressions” 


6.9 Changing Behavior with Redirections 


Problem 


Normally you want a script to behave the same regardless of whether input 
comes from a keyboard or a file, or whether output is going to the screen or a 
file. Occasionally, though, you want to make that distinction. How do you do 
that in a script? 


Solution 


Use test -t 0 in an if statement to branch between the two desired 
behaviors. The 0 is the file descriptor for standard input; use a 1 to test for 
redirection of standard output. The test is true if the file descriptor is 
connected to a terminal, and false otherwise (e.g., false when redirected to a 
file or piped into another program). 
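

A minimal sketch of such a branch follows; the function name stdin_kind is hypothetical and the messages are placeholders for whatever the two behaviors would be in your script:

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration only.
stdin_kind() {
    if test -t 0
    then
        echo "terminal"
    else
        echo "redirected"
    fi
}

stdin_kind                 # reports "terminal" when typed at an interactive prompt
echo data | stdin_kind     # reports "redirected": standard input is a pipe
stdin_kind < /dev/null     # reports "redirected": standard input is a file
```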


Discussion 


Think long and hard before you do this. So much of the power and flexibility 
of bash scripting comes from the fact that scripts can be pipelined together. 
Be sure you have a really good reason to make your script behave oddly 
when input or output is redirected. 


See Also 
• Recipe 2.18, “Using Multiple Redirects on One Line” 
• Recipe 2.19, “Saving Output When Redirect Doesn’t Seem to Work” 
• Recipe 2.20, “Swapping STDERR and STDOUT” 
• Recipe 10.1, ““Daemon-izing” Your Script” 
• Recipe 15.9, “Using bash Net-Redirection” 
• Recipe 15.12, “Redirecting Output for the Life of a Script” 
• “I/O Redirection” in Appendix A 


6.10 Looping for a While 


Problem 


You want your shell script to perform some actions repeatedly as long as 
some condition is met. 


Solution 
Use the while looping construct for arithmetic conditions: 


while (( COUNT < MAX )) 
do 
    some stuff 
    let COUNT++ 
done 


for filesystem-related conditions: 


while [ -z "$LOCKFILE" ] 
do 
    some things 
done 


or for reading input: 


while read lineoftext 
do 
    process $lineoftext 
done 


Discussion 


The double parentheses in our first while statement delimit an arithmetic 
expression, very much like the $(( )) expression for shell variable 
assignment (see Recipe 6.1). The variable names mentioned inside the 
parentheses are meant to be dereferenced. That is, you don’t write $VAR, and 
instead use VAR inside the parentheses. 


The use of the square brackets in while [ -z "$LOCKFILE" ] is the same as 
with the if statement—the single square bracket is the same as using the 
test statement. 


The last example, while read lineoftext, doesn’t have any parentheses, 
brackets, or braces. The syntax of the while statement in bash is defined such 
that the condition of the while statement is a list of statements to be executed 
(just like the if statement), and the exit status of the last one determines 
whether the condition is true or false. An exit status of zero indicates the 
condition is true; otherwise it’s false. 


The read statement returns a 0 on a successful read and a 1 on end-of-file, 
which means that the while will find it true for any successful read, but 
when the end of file is reached (and a 1 is returned) the while condition will 
be false and the looping will end. At that point, the next statement to be 
executed will be the statement after the done statement. 

This logic of “keep looping while the statement returns zero” might seem a 
bit flipped—most C-like languages use the opposite, namely, “loop while 
nonzero.” But in the shell, a zero return value means everything went well; 
nonzero return values indicate an error exit. 


This explains what happens with the (( )) construct, too. Any expression 
inside the parentheses is evaluated, and if the result is nonzero, then the result 
of the (( )) is to return a 0; similarly, a zero result returns a 1. This means 
we can write expressions like Java or C programmers would, but the while 
statement still works as always in bash, expecting a zero result to be true. 
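

You can watch this inversion happen with a small sketch. The arith_status function is a hypothetical helper that evaluates an expression inside (( )) and prints the resulting exit status:

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration only.
arith_status() {
    local rc=0
    (( $1 )) || rc=$?     # capture the exit status of the (( )) command
    echo "$rc"
}

arith_status 5          # prints 0: a nonzero value counts as true
arith_status 0          # prints 1: a zero value counts as false
arith_status '3 > 7'    # prints 1: a false comparison evaluates to zero
```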


In practical terms, it means we can write an infinite loop like this: 


while (( 1 )); do 


done 


which “feels right” to a C programmer. But remember that the while 
statement is looking for a zero return value—which it gets because (( 1 )) 
returns 0 for a true (i.e., nonzero) result. 


Before we leave the while loop, let’s take one more look at that while read 
example, which is reading from standard input (i.e., the keyboard), and see 
how it might get modified in order to read input from a file instead of the 
keyboard. 


This is typically done in one of three ways. The first requires no real 
modifications to the statements at all. Rather, when the script is invoked, 
standard input is redirected from a file like this: 


myscript < file.name 


But suppose you don’t want to leave it up to the caller. If you know what file 
you want to process, or if it was supplied as a command-line argument to 
your script, then you can use this same while loop as is, but redirect the input 
from the file as follows: 


while read lineoftext 
do 

process that line 
done < file.input 


As a third option, you could begin by cat-ing the file to dump it to standard 
output, and then connect the standard output of that program to the standard 
input for the while statement: 


cat file.input | 
while read lineoftext 
do 
    process that line 
done 


WARNING 


Because of the pipe, the cat command and the while loop (including the 
process that line part) are each executing in their own separate processes. 
This means that if you use this method, the script commands inside the while 
loop cannot affect the other parts of the script outside the loop. For example, 
any variables that you set within the while loop will no longer have those 
values after the loop ends. Such is not the case, however, if you use while 
read ... done < file.input, because that isn’t a pipeline. 
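

The following sketch shows the difference by counting lines both ways. It assumes default shell options and a scratch file made with mktemp; the variable names are ours, chosen for illustration:

```bash
#!/usr/bin/env bash
# A sketch only; assumes default shell options (lastpipe not enabled).
INPUT=$(mktemp)
printf 'one\ntwo\nthree\n' > "$INPUT"

PIPED=0
cat "$INPUT" | while read -r lineoftext
do
    PIPED=$(( PIPED + 1 ))            # changes a copy inside the subshell
done

REDIRECTED=0
while read -r lineoftext
do
    REDIRECTED=$(( REDIRECTED + 1 ))  # changes the script's own variable
done < "$INPUT"

echo "piped count: $PIPED"            # prints: piped count: 0
echo "redirected count: $REDIRECTED"  # prints: redirected count: 3
rm -f "$INPUT"
```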


See Also 


• Recipe 6.1, “Doing Arithmetic in Your Shell Script” 
• Recipe 6.2, “Branching on Conditions” 
• Recipe 6.3, “Testing for File Characteristics” 
• Recipe 6.4, “Testing for More than One Thing” 
• Recipe 6.5, “Testing for String Characteristics” 
• Recipe 6.6, “Testing for Equality” 
• Recipe 6.7, “Testing with Pattern Matches” 
• Recipe 6.8, “Testing with Regular Expressions” 
• Recipe 6.11, “Looping with a read” 
• Recipe 19.8, “Forgetting that Pipelines Make Subshells” 


6.11 Looping with a read 


Problem 


You’re using the Subversion revision control system, which is executable as 
svn. (This example is very similar to what you would do for CVS as well.) 
When you check the status of a directory subtree to see what files have been 
changed, you see something like this: 


$ svn status bcb 
M      bcb/amin.c 
?      bcb/dmin.c 
?      bcb/mdiv.tmp 
A      bcb/optrn.c 
M      bcb/optson.c 
?      bcb/prtbout.4161 
?      bcb/rideaslist.odt 
?      bcb/x.maxc 
$ 


The lines that begin with question marks are files about which Subversion 
has not been told; in this case they’re scratch files and temporary copies of 
files. The lines that begin with A are newly added files, and those that begin 
with M have been modified since the last changes were committed. 


To clean up this directory, it would be nice to get rid of all the scratch files. 


Solution 


A common use of a while loop is to read files and the output of previous 
commands. Try: 


svn status mysrc | grep '^?' | cut -c8- | 
while read FN; do echo "$FN"; rm -rf "$FN"; done 


or: 


svn status mysrc | 
while read TAG FN 
do 
    if [[ $TAG == \? ]] 
    then 
        echo $FN 
        rm -rf "$FN" 
    fi 
done 


Discussion 


Both scripts will do the same thing—remove files that svn reports with a 
question mark. The same solutions may be adapted to work with other 
revision control systems. 


The first approach uses several subprograms to do its work (not a big deal in 
these days of gigahertz processors), and would fit on a single line in a typical 
terminal window. It uses grep to select only the lines that begin (signified by 
the ^) with a question mark. The expression '^?' is put in single quotes to 
avoid any special meanings that those characters have for bash. It then uses 
cut to take only the characters beginning in column eight (through the end of 
the line). That leaves just the filenames for the while loop to read. 


The read statement will return a nonzero value when there is no more input, 
so at that point the loop will end. Until then, it will assign the line of text that 
it reads each time into the variable $FN, and that is the filename that we 
remove. We use the -rf options in case the unknown file is actually a 
directory of files, and to remove even read-only files. If you don’t want/need 
to be so drastic in what you remove, leave those options off. 


The second script can be described as more shell-like, since it doesn’t need 
grep to do its searching (it uses the if statement) and it doesn’t need cut to 
do its parsing (it uses the read statement). We’ve also formatted it more like 
you would format a script in a file. If you were typing this at a command 
prompt, you could collapse the indentation, but for our use here the 
readability is much more important than saving a few keystrokes. 


The read in this second script reads into two variables, not just one. That is 
how we get bash to parse the line into two pieces: the leading character and 
the filename. The read statement parses its input into words, like words on a 
shell command line. The first word on the input line is assigned to the first 
variable in the list of variables in the read statement, the second word to the 
second variable, and so on. The last variable in the list gets the entire 
remainder of the line, even if it’s more than a single word. In our example, 
$TAG gets the first word, which is the character (M, A, or ?); the whitespace 
defines the end of that word and the beginning of the next. The variable $FN 
gets the remainder of the line as the filename, which is significant here in 
case the filenames have embedded spaces. (We wouldn’t want just the first 
word of the filename.) The script then removes the filename and the loop 
continues. 
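

If you would like to check the parsing before deleting anything, a dry run along these lines may help. It is a sketch only: list_unknown is a hypothetical helper that prints instead of removing, the status lines are simulated with printf rather than taken from svn, and we add the -r option so that read leaves any backslashes in the names alone:

```bash
#!/usr/bin/env bash
# Hypothetical helper; it only prints the names it would act on.
list_unknown() {
    local TAG FN
    while read -r TAG FN
    do
        if [[ $TAG == \? ]]
        then
            echo "$FN"
        fi
    done
}

# Simulated status lines stand in for real svn output here.
printf '%s\n' 'M      bcb/amin.c' '?      bcb/my notes.txt' | list_unknown
# prints: bcb/my notes.txt
```

The embedded space survives because $FN is the last variable in the list and so receives the whole remainder of the line.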


See Also 
• Appendix D 


6.12 Looping with a Count 


Problem 


You need to loop a fixed number of times. You could use a while loop and 
do the counting and testing, but programming languages have for loops for 
such a common idiom. How does one do this in bash ? 


Solution 


Use a special case of the for syntax, one that looks a lot like C, but with 
double parentheses: 


for (( i=0 ; i < 10 ; i++ )) ; do echo $i ; done 


Discussion 


In early versions of the shell, the original syntax for the for loop only 
included iterating over a fixed list of items. It was a neat innovation for 
word-oriented shell scripts dealing with filenames and such. But when users needed 
to count, they sometimes found themselves writing: 


for i in 1 2 3 4 5 6 7 8 9 10 
do 
    echo $i 
done 


Now that’s not too bad, especially for small loops, but let’s face it—it’s not 
going to work for 500 iterations. (Yes, you could nest loops 5 x 10, but come 
on!) What you really need is a for loop that can count. 


The variation of the for loop with C-like syntax has been in bash since 
version 2.04. Its more general form can be described as: 


for (( expr1 ; expr2 ; expr3 )) ; do list ; done 


The use of double parentheses is meant to indicate that these are arithmetic 
expressions. You don’t need to use the $ construct (as in $i, except for 
arguments like $1) when referring to variables inside the double parentheses 
(just like in the other places where double parentheses are used in bash). The 
expressions are integer arithmetic expressions and offer a rich variety of 
operators, including the use of the comma to put multiple operations within 
one expression: 


for (( i=0, j=0 ; i+j < 10 ; i++, j++ )) 
do 

echo $((i*j)) 
done 


That for loop initializes two variables ($i and $j), then has a more complex 
second expression adding the two together before doing the less-than 
comparison. The comma operator is used again in the third expression to 
increment both variables. 
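

As one more sketch of the counting loop, here is a small function of our own (sum_to is a hypothetical name) that handles the 500-iteration case without any list of numbers:

```bash
#!/usr/bin/env bash
# Hypothetical helper: adds the integers 1 through N.
sum_to() {
    local i total=0
    for (( i = 1 ; i <= $1 ; i++ ))
    do
        (( total += i ))    # no $ needed on total or i inside (( ))
    done
    echo "$total"
}

sum_to 10     # prints 55
sum_to 500    # prints 125250
```

Note that $1 keeps its dollar sign inside the double parentheses, since a bare 1 would be taken as the number one.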
See Also 

• Recipe 6.1, “Doing Arithmetic in Your Shell Script” 
• Recipe 6.13, “Looping with Floating-Point Values” 
• Recipe 17.24, “Writing Sequences” 


6.13 Looping with Floating-Point Values 


Problem 


The for loop with arithmetic expressions only does integer arithmetic. What 
do you do for floating-point values? 


Solution 


Use the seq command to generate your floating-point values, if your system 
provides it: 


for fp in $(seq 1.0 .01 1.1) 
do 

echo $fp; other stuff too 
done 


or: 


seq 1.0 .01 1.1 | 
while read fp 
do 
echo $fp; other stuff too 
done 


Discussion 


The seq command will generate a sequence of floating-point numbers, one 
per line. The arguments to seq are the starting value, the increment, and the 
ending value. This is not the intuitive order if you are used to the C language 
for loop, or if you learned your looping from BASIC (e.g., FOR I=4 TO 10 
STEP 2). With seq the increment is the middle argument. 


In the first example, the $() runs the command in a subshell and returns the 
result with the newlines replaced by just whitespace, so each value is a string 
value for the for loop. 


In the second example, seq is run as a command with its output piped into a 
while loop that reads each line and does something with it. This would be the 
preferred approach for a really long sequence, as it can run the seq command 
in parallel with the while. The for loop version has to run seq to completion 
and put all of its output on the command line for the for statement. For very 
large sequences, this could be time- and memory-consuming. 
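

If your system does not provide seq, one possible workaround (a sketch of our own, not part of the recipe) is to count in whole hundredths with the integer for loop and let printf place the decimal point. The function name hundredths is hypothetical, and it assumes non-negative values with exactly two decimal places:

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints START/100 through END/100 in steps of 0.01.
hundredths() {
    local i
    for (( i = $1 ; i <= $2 ; i++ ))
    do
        # integer part, then the two-digit fraction
        printf '%d.%02d\n' $(( i / 100 )) $(( i % 100 ))
    done
}

hundredths 100 110 |
while read -r fp
do
    echo "$fp"     # prints 1.00, 1.01, and so on through 1.10
done
```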


See Also 


• Recipe 2.17, “Connecting Two Programs by Using Output as Arguments” 
• Recipe 6.12, “Looping with a Count” 
• Recipe 17.24, “Writing Sequences” 


6.14 Branching Many Ways 


Problem 


You have a series of comparisons to make, and the if/then/else is getting 
pretty long and repetitive. Isn’t there an easier way? 


Solution 


Use the case statement for a multiway branch: 


case $FN in 
    *.gif) gif2png $FN 
        ;; 
    *.png) pngOK $FN 
        ;; 
    *.jpg) jpg2gif $FN 
        ;; 
    *.tif | *.TIFF) tif2jpg $FN 
        ;; 
    *) printf "File not supported: %s" $FN 
        ;; 
esac 


The equivalent to this using if/then/else statements is: 


if [[ $FN == *.gif ]] 
then 
    gif2png $FN 
elif [[ $FN == *.png ]] 
then 
    pngOK $FN 
elif [[ $FN == *.jpg ]] 
then 
    jpg2gif $FN 
elif [[ $FN == *.tif || $FN == *.TIFF ]] 
then 
    tif2jpg $FN 
else 
    printf "File not supported: %s" $FN 
fi 


Discussion 


The case statement will expand the word (including parameter substitution) 
between the case and in keywords. It will then try to match the word with 
the patterns listed in order. This is a very powerful feature of the shell. It is 
not just doing simple value comparisons, but string pattern matches (though 
not regular expressions). We have simple patterns in our example: *.gif 
matches any character sequence (signified by the *) that ends with the literal 
characters .gif. 


Use |, a vertical bar meaning logical OR, to separate different patterns for 
which you want to take the same action. In our example, if $FN ends either 
with .tif or .TIFF then the pattern will match and the (fictional) tif2jpg 
command will be executed. 


There is no else or default keyword to indicate the statements to execute if 
no pattern matches. Instead, use * as the last pattern, since that pattern will 
match anything. Placing it last makes it act as the default and match anything 
that hasn’t already been matched. 


The double semicolon (;;) ends the set of statements associated with a 
pattern. As of bash version 4, there are two other ways to end a set of 
statements. The ;;& construct means that even if a match is found, the next 
pattern will be tested for a match and its statements will be executed as well 
if another match is found. The ;& construct means that execution will “fall 
through,” and the next set of statements will be executed regardless of 
whether its pattern matches. Here is a somewhat pointless example to show 
the use of these features: 


# use other endings for case 

case $FN in 
    *.gif) gif2png $FN 
        ;;&     # keep looking 
    *.png) pngOK $FN 
        ;;&     # keep looking 
    *.jpg) jpg2gif $FN 
        ;;&     # keep looking 
    *.tif) tif2jpg $FN 
        ;&      # fall through 
    *.* ) echo "two.words" 
        ;; 
    * ) echo "oneword" 
esac 


If $FN matches any of the first four patterns bash will execute its (fictional) 
conversion command, but also keep looking; it will find that it matches the 
fifth pattern as well and therefore also echo the phrase two.words. 
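

Since the conversion commands above are fictional, here is a shortened sketch you can actually run (in bash 4 or newer). The classify function is a hypothetical helper that replaces the conversion commands with echo so the path taken through the case is visible:

```bash
#!/usr/bin/env bash
# Hypothetical helper; requires bash 4+ for ;;& and ;&.
classify() {
    case $1 in
        *.gif) echo "gif"
            ;;&     # keep looking
        *.tif) echo "tif"
            ;&      # fall through
        *.*)   echo "two.words"
            ;;
        *)     echo "oneword"
    esac
}

classify logo.gif    # prints gif, then two.words (later pattern also matched)
classify scan.tif    # prints tif, then two.words (fell through)
classify README      # prints oneword
```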
NOTE 


An aside to C/C++ and Java programmers: the bash case is similar to the 
switch statement, and each pattern corresponds to a case. Notice, though, that 
the variable on which you can switch/case is a shell variable (typically a string 
value) and the cases are patterns (not just constant values). The patterns end 
with a right parenthesis (not a colon). The equivalent to the break in C/C++ 
and Java switch statements is, in bash, a double semicolon. The equivalent to 
their default keyword is, in bash, the * pattern. 


Matches are case-sensitive, but you may use shopt -s nocasematch 
(available in bash versions 3.1+) to change that. This option affects case and 
[[ commands. 


We end the case statement with an esac (that’s “c-a-s-e” spelled backward; 
this came from Algol 68). 


See Also 


• help case 
• help shopt 
• Recipe 6.2, “Branching on Conditions” 


6.15 Parsing Command-Line Arguments 


Problem 


You want to write a simple shell script to print a line of dashes, but you want 
to parameterize it so that you can specify different line lengths and specify a 
character to use other than just a dash. The syntax would look like this: 


dashes          # would print out 72 dashes 
dashes 50       # would print out 50 dashes 
dashes -c = 50  # would print out 50 equals signs 
dashes -c x     # would print out 72 x characters 


What’s an easy way to parse those simple arguments? 


Solution 


For serious scripting, you should use the getopts builtin. But we would like to 
show you the case statement in action, so for this simple situation we’ll use 
case for argument parsing. 


Example 6-5 shows the beginning of the script (see Recipe 12.1 for a 
complete version). 


Example 6-5. ch06/dashes 


#!/usr/bin/env bash 
# cookbook filename: dashes 
# 
# dashes - print a line of dashes 
# 
# options: #    how many (default 72) 
#          -c X use char X instead of dashes 
# 
LEN=72 
CHAR='-' 
while (( $# > 0 )) 
do 
    case $1 in 
        [0-9]*) LEN=$1 
        ;; 
        -c) shift; 
            CHAR=${1:--} 
        ;; 
        *) printf 'usage: %s [-c X] [#]\n' ${0##*/} >&2 
           exit 2 
        ;; 
    esac 
    shift 
done 
# 
# more... 


Discussion 


The default length (72) and the default character (-) are set at the beginning 
of the script (after some useful comments). The while loop allows us to parse 
more than one parameter. It will keep looping while the number of arguments 
($#) is above zero. 


The case statement matches three different patterns. First, the [0-9]* will 
match any digit followed by any other characters. We could have used a more 
elaborate expression to allow only pure numbers, but we’ll assume that any 
argument that begins with a digit is a number. If that isn’t true (e.g., if the 
user types 1T4), then the script will error when it tries to use $LEN. We can 
live with that for now. 


The second pattern is a literal -c. There is no pattern to this, just an exact 
match. In that case, we use the shift builtin command to throw away that 
argument (now that we know what it is) and we take the next argument 
(which has now become the first argument, so it is referenced as $1) and save 
that as the new character choice. We use :- when referencing $1 (as in 
${1:-x}) to specify a default value if the parameter isn’t set. That way, if the user 
types -c but fails to specify an argument, it will use the default, specified as 
the character immediately following the :-. In the expression ${1:-x} it 
would be x. For our script, we wrote ${1:--} (note the two minus signs), so 
the character taken as the default is the (second) minus sign. 
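

To try the expansion on its own, consider this sketch; pick_char is a hypothetical helper that does nothing but apply ${1:--} to its first argument:

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration only.
pick_char() {
    local CHAR=${1:--}    # use $1, or a dash if $1 is unset or empty
    echo "$CHAR"
}

pick_char =     # prints =
pick_char       # prints - (no argument, so the default is used)
pick_char ""    # prints - (an empty argument also triggers the default)
```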


The third pattern is the wildcard pattern (*), which matches everything, so 
that any argument unmatched by the previous patterns will be matched here. 
Placed last in the case statement, it is the catch-all that notifies the user of an 
error (since it wasn’t one of the prescribed parameters); it prints a message 
instructing the user about correct usage. 


That printf error message probably needs explaining if you’re new to bash. 
There are four sections of that statement to look at. The first is simply the 
command name, printf. The second is the format string that printf will use 
(see Recipe 2.3 and “printf” in Appendix A). We use single quotes around the 
string so that the shell doesn’t try to interpret any of the string. The last part
of the line (>&2) tells the shell to redirect the output to standard error. Since
this is an error message, that seems appropriate. Many script writers are 
casual about this and often neglect this redirection on error messages. We 
think it is a good habit to always redirect error messages to standard error. 
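

One way to make the habit easy to keep is to wrap the redirection in a tiny function. This is only a sketch; the name err is our own choice and is not used elsewhere in the script.

```bash
# Hypothetical helper: print all arguments as one line on standard error.
err() {
    printf '%s\n' "$*" >&2
}

err "something went wrong"
```

Anything printed with err stays out of standard output, so it will not be swept into a pipeline or captured by $( ) along with the script's real results.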


The third part of the line uses string manipulation on $0. This is a common 
idiom used to strip off any leading path part of how the command was 
invoked. For example, consider what would happen if we used only $0. Here 
are two different but erroneous invocations of the same script. Notice the 
error messages: 


$ dashes -g 
usage: dashes [-c X] [#] 


$ /usr/local/bin/dashes -g 
usage: /usr/local/bin/dashes [-c X] [#] 


In the second invocation, we used the full pathname. The error message then 
also contained the full pathname. Some people find this annoying. So, we 
strip $0 down to just the script’s base name (similar to using the basename 
command). Then the error messages look the same regardless of how the 
script is invoked: 


$ dashes -g 
usage: dashes [-c X] [#] 


$ /usr/local/bin/dashes -g 
usage: dashes [-c X] [#] 


While this certainly takes a bit more time than just hardcoding the script 
name or using $0 without trimming it, the script is more portable this way—if 
you change the script’s name you don’t have to modify the code. If you 
prefer to use the basename command in a subshell, that is also worthwhile, as 
the extra time isn’t that vital. This is an error message and the script is about 
to exit anyway. 
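

The two approaches can be compared side by side. In this sketch the variable path is a stand-in for $0, with a value we made up for the demonstration.

```bash
path=/usr/local/bin/dashes     # stand-in for $0 (assumed value)

trimmed=${path##*/}            # strip the longest leading match of */
external=$(basename "$path")   # same result, at the cost of a subprocess

echo "$trimmed $external"
```

Both variables end up holding dashes; the parameter expansion simply avoids starting another process.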


We end the case statement with an esac and then do a shift so as to consume
the argument that we just matched in our case statement. If we didn’t do that,
we'd be stuck in the while loop, parsing the same argument over and over. 
The shift will cause the second argument ($2) to become the first ($1) and the 
third to become the second, and so on, but also $# to be one smaller. On some 
iteration of the loop $# finally reaches zero (when there are no more 
arguments), and the loop terminates. 
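

The effect of shift on $# is easy to watch in isolation. The function count_args below is a hypothetical example, not part of the dashes script; it consumes its arguments one at a time exactly as the while loop does.

```bash
# Hypothetical demonstration: count arguments by shifting them away.
count_args() {
    n=0
    while [ $# -gt 0 ]
    do
        n=$((n + 1))
        shift            # $2 becomes $1, and $# drops by one
    done
    echo "$n"
}

count_args a b c
```

Each pass through the loop discards one argument, so the loop body runs once per argument and then stops.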


The actual printing of the dashes (or other character) is not shown here, as we 
wanted to focus on the case statement and related actions. You can see the 
complete script, with a function for the usage message, in its entirety in 
Recipe 12.1. 


See Also 


• help case
• help getopts
• Recipe 2.3, “Writing Output with More Formatting Control”
• Recipe 5.8, “Looping Over Arguments Passed to a Script”
• Recipe 5.11, “Counting Arguments”
• Recipe 5.12, “Consuming Arguments”
• Recipe 5.20, “Using bash for basename”
• Recipe 6.15, “Parsing Command-Line Arguments”
• Recipe 12.1, “Starting Simple by Printing Dashes”
• Recipe 13.1, “Parsing Arguments for Your Shell Script”
• Recipe 13.2, “Parsing Arguments with Your Own Error Messages”
• “printf” in Appendix A


6.16 Creating Simple Menus 


Problem


You have a simple SQL script that you would like to run against different
databases to reset them for tests that you want to run. You could supply the 
name of the database on the command line, but you want something more 
interactive. How can you write a shell script to choose from a list of names? 


Solution 


Use the select statement to create simple character-based screen menus, as 
in Example 6-6. 


Example 6-6. ch06/dbinit.1 


#!/usr/bin/env bash
# cookbook filename: dbinit.1
#
DBLIST=$(sh ./listdb | tail -n +2)
select DB in $DBLIST
do
    echo Initializing database: $DB
    mysql -u user -p $DB <myinit.sql
done


Ignore for a moment how $DBLIST gets its values; just know that it is a list of
words (like the output from ls would give). The select statement will
display those words, each preceded by a number, and the user will be 
prompted for input. The user makes a choice by typing the number and the 
corresponding word is assigned to the variable specified after the keyword 
select (in this case, DB). 


Here’s what the running of this script might look like: 


$ ./dbinit
1) testDB
2) simpleInventory
3) masterInventory
4) otherDB
#? 2
Initializing database: simpleInventory
#?
$


Discussion


When the user types “2” the variable DB is assigned the word
simpleInventory. If you really want to get at the user’s literal choice, the
variable $REPLY will hold it; in this case it would be 2.


The select statement is really a loop. When the user has entered a choice it 
will execute the body of the loop (between the do and the done) and then 
reprompt for the next value. 


It doesn’t redisplay the list every time, only if the user makes no choice and 
just presses the Enter key. So, to see the list again, the user can press Enter. 


It also does not reevaluate the code after the in—that is, you can’t alter the 
list once you’ve begun. If you modified $DBLIST inside the loop, it wouldn’t
change the list of choices. 


The looping will stop when it reaches the end of the file, which for interactive 
use means when the user types Ctrl-D. (If you piped a series of choices into a 
select loop, it would end when the input ends.) 
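

Piping choices into a select loop is also a convenient way to experiment with it. The sketch below uses a hardcoded list and a hypothetical function name, choose, in place of the database script.

```bash
# Hypothetical demonstration of a select loop driven by piped input.
choose() {
    select DB in testDB simpleInventory masterInventory
    do
        # $DB holds the chosen word; $REPLY holds what was actually typed
        echo "picked: $DB (typed: $REPLY)"
    done
}
```

Running printf '2\n3\n' | choose 2>/dev/null prints one “picked” line for each choice and then stops when the input runs out. The menu and the prompt are written to standard error, which is why redirecting it hides them.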


There isn’t any formatting control over the list, though it will take the value 
of $COLUMNS into account. If you’re going to use select, you have to be
satisfied with the way it displays your choices. You can, however, alter the 
prompt on the select using the $PS3 variable, as we discuss next and in 
Recipe 16.12. 


See Also 


• Recipe 3.7, “Selecting from a List of Options”
• Recipe 6.17, “Changing the Prompt on Simple Menus”
• Recipe 16.2, “Customizing Your Prompt”
• Recipe 16.12, “Using Secondary Prompts: $PS2, $PS3, $PS4”


6.17 Changing the Prompt on Simple Menus


Problem


You just don’t like that prompt in the select menus. How can it be changed? 


Solution 


The bash environment variable $PS3 is the prompt used by select. Set it to a 
new value and you’ll get a new prompt. 


Discussion 


This is the third of the bash prompts. The first ($PS1) is the prompt you get 
before most commands. (We’ve used $ in our examples, but it can be much 
more elaborate than that, including the user ID or directory name.) If a line of 
command input needs to be continued, the second prompt is used ($PS2). 


For select loops, the third prompt, $PS3, is used. Set it before the select 
statement to make the prompt be whatever you want. You can even modify it 
within the loop to have it change as the loop progresses. 


The script in Example 6-7 is similar to the one in the previous recipe, but it 
counts how many times it has handled a valid input. 


Example 6-7. ch06/dbinit.2 


#!/usr/bin/env bash
# cookbook filename: dbinit.2
#
DBLIST=$(sh ./listdb | tail -n +2)

PS3="0 inits >"
select DB in $DBLIST
do
    if [ $DB ]
    then
        echo Initializing database: $DB

        PS3="$((++i)) inits> "

        mysql -u user -p $DB <myinit.sql
    fi
done


We’ve added some extra whitespace to make the setting of $PS3 stand out 
more. The if statement assures us that we’re only counting the times when 
the user entered a valid choice. Such a check would have been useful in the 
previous version, but we were keeping it simple. 


See Also 


• Recipe 3.7, “Selecting from a List of Options”
• Recipe 6.17, “Changing the Prompt on Simple Menus”
• Recipe 16.2, “Customizing Your Prompt”
• Recipe 16.12, “Using Secondary Prompts: $PS2, $PS3, $PS4”


6.18 Creating a Simple RPN Calculator 


Problem 


You may be able to convert binary to decimal, octal, or hex in your head, but 
it seems that you can’t do simple arithmetic anymore and you can never find 
a calculator when you need one. What to do? 


Solution 


Create a calculator using shell arithmetic and RPN notation, as in Example 6-8.


Example 6-8. ch06/rpncalc 


#!/usr/bin/env bash
# cookbook filename: rpncalc
#
# simple RPN command-line (integer) calculator
#
# takes the arguments and computes with them
# of the form a b op
# allow the use of x instead of *
#
# error check our argument counts:
if [ \( $# -lt 3 \) -o \( $(($# % 2)) -eq 0 \) ]
then
    echo "usage: calc number number op [ number op ] ..."
    echo "use x or '*' for multiplication"
    exit 1
fi

ANS=$(($1 ${3//x/*} $2))
shift 3
while [ $# -gt 0 ]
do
    ANS=$((ANS ${2//x/*} $1))
    shift 2
done
echo $ANS


Discussion 


The RPN (or postfix) style of notation puts the operands (the numbers) first, 
followed by the operator. If we are using RPN, we don’t write 5 + 4 but 
rather 5 4 + as our expression. If you want to multiply the result by 2, then 
you just put 2 * on the end, so the whole expression would be 5 4 + 2 *, 
which is great for computers to parse because you can go left to right and 
never need parentheses. The result of any operation becomes the first operand 
for the next expression. 


In our simple bash calculator we will allow the use of a lowercase x as a 
substitute for the multiplication symbol since * has special meaning to the 
shell. But if you escape that special meaning by writing '*' or \* we want 
that to work, too. 


How do we error check the arguments? We will consider it an error if there
are fewer than three arguments (we need two operands and one operator, e.g., 6
3 /). There can be more than three arguments, but in that case there will
always be an odd number (since we start with three and add two more, a
second operand and the next operator, and so on, always adding two more;
the valid number of arguments would be 3 or 5 or 7 or 9 or...). We check that
with the expression:


$(($# % 2)) -eq 0


to see if the result is zero. The $(( )) says we’re doing some shell arithmetic 
inside. We are using the % operator (called the remainder operator) to see if 
$# (which is the number of arguments) is divisible by 2 with no remainder 
(i.e., -eq 0). 
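

The same rule can be tried on a few counts directly. The function valid_count is a hypothetical helper written for this illustration; it takes a count as its argument rather than looking at $#, and it states the rule positively (at least three, and odd) instead of testing for the error case.

```bash
# Hypothetical helper: is this a legal number of arguments for rpncalc?
valid_count() {
    [ "$1" -ge 3 ] && [ $(( $1 % 2 )) -eq 1 ]
}

for n in 2 3 4 5
do
    if valid_count "$n"; then echo "$n ok"; else echo "$n bad"; fi
done
```

Counts of 3 and 5 are accepted, while 2 and 4 are rejected: 2 is too few, and 4 leaves an operand without an operator.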


WARNING


Any arithmetic done within $(( )) is integer arithmetic only. 
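

A quick illustration of what integer-only means in practice, using variable names of our own choosing:

```bash
quotient=$(( 7 / 2 ))     # the fractional part is discarded, not rounded
remainder=$(( 7 % 2 ))    # what is left over after the division

echo "7 / 2 = $quotient remainder $remainder"
```

The division yields 3 rather than 3.5, and the remainder operator recovers the 1 that was dropped.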


Now that we know there are the right number of arguments, we can use them 
to compute the result. We write: 


ANS=$(($1 ${3//x/*} $2))


which will compute the result and substitute the asterisk for the letter x at the 
same time. When you invoke the script you give it an RPN expression on the 
command line, but the shell syntax for arithmetic is our normal (infix) 
notation. So, we can evaluate the expression inside of $(( )) but we have to 
switch the arguments around. Ignoring the x-to-* substitution for the 
moment, you can see it is just: 


ANS=$(($1 $3 $2)) 


which just moves the operator between the two operands. bash will substitute 
the parameters before doing the arithmetic evaluation, so if $1 is 5 and $2 is 4 
and $3 is a +, then after parameter substitution bash will have:


ANS=$((5 + 4))


and it will evaluate that and assign the result, 9, to $ANS. Done with those
three arguments, we shift 3 to toss them and get the new arguments into 
play. Since we’ve already checked that there are an odd number of 
arguments, if we have any more arguments to process we will have at least 
two more (only one more and it would be an even number, since 3+1=4). 


From that point on we loop, taking two arguments at a time. The previous 
answer is the first operand, the next argument (now $1 as a result of the shift) 
is our second operand, and we put the operator inside $2 in between and 
evaluate it all much like before. Once we are out of arguments, the answer is 
what we have in $ANS.
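

To trace the loop on a concrete expression, here is the same logic wrapped in a function. The name rpn and the local variable ans are our own, and the argument-count check is left out to keep the sketch short.

```bash
# Hypothetical function form of the calculator, without the usage check.
rpn() {
    local ans=$(( $1 ${3//x/*} $2 ))    # first three arguments: a b op
    shift 3
    while [ $# -gt 0 ]
    do
        ans=$(( ans ${2//x/*} $1 ))     # previous answer, next operand, operator
        shift 2
    done
    echo "$ans"
}

rpn 5 4 + 2 x     # evaluated as (5 + 4) * 2
```

The first step computes 5 + 4 and leaves 2 x as the remaining arguments; the loop then runs once, multiplying the running answer of 9 by 2 to give 18.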


One last word about the substitution. ${2} would be how we refer to the 
second argument. Though we often don’t bother with the {} and just write 
$2, we need them here for the additional operations we will ask bash to 
perform on the argument. We write ${2//x/*} to say that we want to replace 
or substitute (//) an x with (indicated by the next /) an * before returning the 
value of $2. We could have written this in two steps by creating an extra 
variable: 


OP=${2//x/*}
ANS=$((ANS $OP $1))


That extra variable can be helpful as you first begin to use these features of
bash, but once you are familiar with these common expressions, you’ll find
yourself putting them all together on one line (even though it’ll be harder to
read).


Are you wondering why we wrote ANS without a $ in the expression that
does the evaluation? We don’t have to use the $ on the names of variables
that hold numbers inside of $(( )) expressions. The positional parameters
(e.g., $1, $2) do need it, to distinguish them from regular numbers (e.g.,
1, 2), and so does $OP, because it holds an operator rather than a number
and must be substituted into the expression before bash evaluates it.


See Also


• Chapter 5
• Recipe 6.1, “Doing Arithmetic in Your Shell Script”
• Recipe 6.19, “Creating a Command-Line Calculator”


6.19 Creating a Command-Line Calculator 


Problem 


You need more than just integer arithmetic, and you’ve never been very fond 
of RPN notation. How about a different approach to a command-line 
calculator? 


Solution 


Create a trivial command-line calculator using awk’s built-in floating-point 
arithmetic expressions, as in Example 6-9. 


Example 6-9. ch06/func_calc


# cookbook filename: func_calc
# Trivial command-line calculator
function calc {
    # INTEGER ONLY! --> echo The answer is: $(( $* ))
    # Floating point
    awk "BEGIN {print \"The answer is: \" $* }";
} # end of calc


Discussion 


You may be tempted to skip the awk command and try echo The answer
is: $(( $* )). This will work fine for integers, but will truncate the results of
floating-point operations.
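

The difference is easy to see with a division that does not come out even. The variable names here are ours, chosen just for the comparison.

```bash
int_result=$(( 10 / 4 ))                       # shell arithmetic: integers only
float_result=$(awk 'BEGIN { print 10 / 4 }')   # awk: floating point

echo "shell: $int_result  awk: $float_result"
```

The shell reports 2, silently dropping the fraction, while awk reports 2.5.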


We use a function because aliases (see Recipe 10.7) do not allow the use of
arguments. You will probably want to add this function to your global
/etc/bashrc or local ~/.bashrc.


The operators are what you’d expect and are the same as in C: 


$ calc 2+3+4 
The answer is: 9 


$ calc 2 +3 + 4.5 
The answer is: 9.5 


Watch out for shell metacharacters. For example: 


$ calc (2+2-3)*4 
-bash: syntax error near unexpected token `2+2-3' 


You need to escape the special meaning of the parentheses. You can put the 
expression inside single quotes, or just use a backslash in front of any special 
(to the shell) character to escape its meaning. For example: 


$ calc '(2+2-3)*4' 
The answer is: 4 


$ calc \(2+2-3\)\*4 
The answer is: 4 


$ calc '(2+2-3)*4.5' 
The answer is: 4.5 


We need to escape the multiplication symbol too, since that has special 
meaning to bash as the wildcard for filenames. This is especially true if you 
like to put whitespace around your operators, as in 17 + 3 * 21, because 
then * will match all the files in the current directory, putting their names on 
the command line in place of the asterisk—definitely not what you want. 


See Also


• man awk
• “ARITHMETIC EVALUATION” in the bash(1) manpage
• Recipe 6.18, “Creating a Simple RPN Calculator”
• Recipe 10.7, “Redefining Commands with alias”
• Recipe 16.8, “Shortening or Changing Command Names”


Chapter 7. Intermediate Shell Tools I


It is time to expand our repertoire. This chapter’s recipes use some utilities 
that are not part of the shell, but which are so useful that it is hard to imagine 
using the shell without them. 


One of the overarching philosophies of Unix (and thus Linux) is that of small 
(i.e., limited in scope) program pieces that can be fit together to provide 
powerful results. Rather than have one program that does everything, we 
have many different programs that each do one thing well. 


That applies to bash as well. While it’s getting big and feature-rich, it still 
doesn’t try to do everything, and there are times when it is easier to use other 
commands to accomplish a task even if bash can be stretched to do it. 


A simple example of this is the ls command. You needn’t use ls to see the
contents of your current directory. You could just type echo * to have
filenames displayed. Or you could even get fancier, using the bash printf
command and some formatting, etc. But that’s not really the purpose of the
shell, and someone has already provided a listing program (ls) to deal with all
sorts of variations in filesystem information.


Perhaps more importantly, by not expecting bash to provide more filesystem 
listing features, we avoid additional feature creep pressures and instead give 
it some measure of independence; ls can be released with new features
without requiring that we all upgrade our bash versions. 


But enough philosophy—back to the practical. 


What we have here are three of the most useful text-related utilities: grep, 
sed, and awk. 


The grep program searches for strings, the sed program provides a way to
edit text as it passes through a pipeline, and awk...well, awk is its own
interesting beast, a precursor to perl and a bit of a chameleon: it can look
quite different depending on how it is used.


These utilities, and a few more that we will discuss in the next chapter, are 
very much a part of most shell scripts and most sessions spent typing 
commands to bash. If your shell script requires a list of files on which to 
operate, it is likely that either find or grep will be used to supply that list of 
files, and it’s likely that sed and/or awk will be used to parse the input or 
format the output at some stage of the shell script. 


To say it another way, if our scripting examples are going to tackle real-world 
problems, they need to use the wider range of tools that are actually used by 
real-world bash users and programmers. 


7.1 Sifting Through Files for a String


Problem 


You need to find all occurrences of a string in one or more files. 


Solution 


The grep command searches through files looking for the expression you 
supply: 


$ grep printf *.c
both.c: printf("Std Out message.\n", argv[0], argc-1);
both.c: fprintf(stderr, "Std Error message.\n", argv[0], argc-1);
good.c: printf("%s: %d args.\n", argv[0], argc-1);
somio.c: // we'll use printf to tell us what we
somio.c: printf( "open: fd=%d\n", iod[i]);
$


The files we searched through in this example were all in the current 
directory. We just used the simple shell pattern *.c to match all the files 
ending in .c with no preceding pathname.


Not all the files through which you want to search may be that conveniently
located. Of course, the shell doesn’t care how much pathname you type, so 
we could have done something like this: 


grep printf ../lib/*.c ../server/*.c ../cmd/*.c */*.c 


Discussion 


When more than one file is searched, grep begins its output with the 
filename, followed by a colon. The text after the colon is what actually 
appears in the files that grep searched. 


The search matches any occurrence of the specified characters, so a line that 
contained the string “fprintf” was returned, since “printf” is contained within 
“fprintf”. 
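

If you want “printf” only when it stands alone as a word, many versions of grep offer a -w option. The sketch below assumes your grep supports -w (GNU and BSD grep do); it builds a small throwaway file, with contents we made up, and counts matching lines both ways.

```bash
tmp=$(mktemp)
printf '%s\n' 'printf("Std Out message.\n");' \
              'fprintf(stderr, "Std Error message.\n");' > "$tmp"

plain=$(grep -c printf "$tmp")     # substring match: both lines count
whole=$(grep -cw printf "$tmp")    # whole-word match: fprintf is skipped

echo "plain: $plain  whole word: $whole"
rm -f "$tmp"
```

The plain search counts two lines and the whole-word search counts one, since the “printf” inside “fprintf” is preceded by a letter.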


The first (nonoption) argument to grep can be just a simple string, as in this 
example, or it can be a more complex regular expression (regexp). These 
regexps are not the same as the shell’s pattern matching, though they can 
look similar at times. Pattern matching is so powerful that you may find 
yourself relying on it to the point where you’ll start using “grep” as a verb,
and wishing you could make use of it everywhere, as in “I wish I could grep 
my desk for that paper you wanted.” 


You can vary the output from grep using command-line options. If you don’t 
want to see the specific filenames, you may turn this off using the -h option 
to grep: 


$ grep -h printf *.c 
printf("Std Out message.\n", argv[0], argc-1); 
fprintf(stderr, "Std Error message.\n", argv[0], argc-1); 
printf("%s: %d args.\n", argv[0], argc-1); 
// we'll use printf to tell us what we 
printf("open: fd=%d\n", iod[i]); 


If you don’t want to see the actual lines from the file, but only a count of the
number of lines on which the expression is found, then use the -c option:


$ grep -c printf *.c
both.c:2
good.c:1
somio.c:2
$


WARNING 


A common mistake is to forget to provide grep with a source of input—for 
example, grep myvar. In this case grep assumes you will provide input from 
STDIN, but you think it will get it from a file. So it just sits there forever, 
seemingly doing nothing. (In fact, it is waiting for input from your keyboard.) 
This is particularly hard to catch when you are grepping a large amount of data 
and expect it to take a while. 


See Also 


• man grep
• man regex (Linux, Solaris, HP-UX) or man re_format (BSD, Mac) for
  the details of your regular expression library
• Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl
  (O’Reilly)
• Classic Shell Scripting by Nelson H. F. Beebe and Arnold Robbins
  (O’Reilly), Sections 3.1 and 3.2
• Chapter 9 and the find utility, for more far-reaching searches
• Recipe 9.5, “Finding Files Irrespective of Case”


7.2 Getting Just the Filename from a Search


Problem 


You need to find the files in which a certain string appears. You don’t want
to see the line of text that was found, just the filenames.


Solution


Use the -l option of grep to get just the filenames: 


$ grep -l printf *.c 
both.c 
good.c 
somio.c 


$ 


Discussion


If grep finds more than one match per file, it still only prints the name once.
If grep finds no matches, it gives no output. 


This option is handy if you want to build a list of files to be operated on, 
based on the fact that they contain the string that you’re looking for. Put the 
grep command inside $() and those filenames can be used on the command 
line. 


For example, to remove the files that contain the phrase “This file is 
obsolete,” you could use this shell command combination: 


rm -i $(grep -l 'This file is obsolete' * ) 


We’ve added the -i option to rm so that it will ask you before it removes 
each file. That’s obviously a safer way to operate, given the power of this 
combination of commands. 


bash expands the * to match every file in the current directory (but does not 
descend into subdirectories) and passes them as the arguments to grep. Then 
grep produces a list of filenames that contain the given string. This list then is 
handed to the rm command to remove each file. 
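

One caveat with this combination is that the shell splits the output of $( ) on whitespace, so a filename containing a space would reach rm as two separate arguments. A cautious first step is to preview the list one line at a time. The sketch below works in a scratch directory with made-up filenames and deletes nothing from your own files.

```bash
dir=$(mktemp -d)
printf 'This file is obsolete\n' > "$dir/old notes.txt"
printf 'still current\n'         > "$dir/keep.txt"

# Read grep's output a line at a time so names with spaces stay intact.
preview=$(cd "$dir" && grep -l 'This file is obsolete' ./* |
    while IFS= read -r file
    do
        printf 'would remove: %s\n' "$file"
    done)

echo "$preview"
rm -rf "$dir"     # clean up the scratch directory only
```

Only the file containing the phrase is listed, and its name arrives in one piece despite the embedded space.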


See Also 


• man grep
• man rm
• man regex (Linux, Solaris, HP-UX) or man re_format (BSD, Mac) for
  the details of your regular expression library
• Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl
  (O’Reilly)
• Recipe 2.15, “Connecting Two Programs by Using Output as Input”
• Recipe 9.5, “Finding Files Irrespective of Case”


7.3 Getting a Simple True/False from a Search 


Problem 


You need to know whether a certain string is in a particular file, and you just 
want a yes or no sort of answer. 


Solution 


Use -q, the “quiet” option for grep. Or, for maximum portability, just throw 
the output away by redirecting it into /dev/null. Either way, your answer is in 
the bash return status variable $?, so you can use it in an if test like this: 


$ if grep -q findme bigdata.file ; then echo yes ; else echo nope ; fi 
nope 


$ 


Discussion 


In a shell script, you often don’t want the results of the search displayed in 
the output; you just want to know whether there is a match so that your script 
can branch accordingly. 


As with most Unix/Linux commands, a return value of 0 indicates successful
completion. In this case, success is defined as having found the string in at
least one of the given files (in this example, we searched in only one file).
The return value is stored in the shell variable $?, which we can then use in
an if statement.


If we list multiple filenames after grep -q, then grep stops searching after 
the very first occurrence of the search string being found. It doesn’t search all 
the files, as you really just want to know whether it found any occurrence of 
the string. If you really need to read through all the files (why?), then rather 
than use -q you can do this: 


$ if grep findme bigdata.file > /dev/null ; then echo yes ; else echo nope ; fi
nope
$


The redirecting to /dev/null sends the output to a special kind of device, a bit
bucket, that just throws away everything you give it.


The /dev/null technique is also useful if you want to write shell scripts that 
are portable across the various flavors of grep available on Unix and Linux 
systems, should you find one that doesn’t support the -q option. 
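

In a script, the portable form can be tucked into a small function so the calling code reads cleanly. The function name contains and the sample file contents below are our own inventions for this sketch.

```bash
# Hypothetical helper: succeed if the string in $1 appears in the file $2.
contains() {
    grep "$1" "$2" > /dev/null 2>&1
}

tmp=$(mktemp)
printf '%s\n' 'alpha' 'findme here' > "$tmp"

if contains findme "$tmp"; then found=yes;   else found=nope;   fi
if contains absent "$tmp"; then missing=yes; else missing=nope; fi

echo "$found $missing"
rm -f "$tmp"
```

The function's exit status is simply that of grep, so it can be used directly in an if statement. Here we also discard standard error, which hides complaints about unreadable files; leave off the 2>&1 if you would rather see them.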


See Also


• man grep
• man regex (Linux, Solaris, HP-UX) or man re_format (BSD, Mac) for
  the details of your regular expression library
• Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl
  (O’Reilly)
• Recipe 9.5, “Finding Files Irrespective of Case”


7.4 Searching for Text While Ignoring Case 


Problem


You need to search for a string (e.g., “error”) in a logfile, and you want to do
it case-insensitively to catch all occurrences. 


Solution 


Use the -i option on grep to ignore case: 


grep -i error logfile.msgs 


Discussion 


A case-insensitive search finds messages written “ERROR,” “error,” and 
“Error,” as well as ones like “ErrOR” and “eRrOr.” This option is particularly 
useful for finding words anywhere that you might have mixed-case text, 
including words that might be capitalized at the beginning of a sentence or in 
email addresses. 


See Also


• man grep
• man regex (Linux, Solaris, HP-UX) or man re_format (BSD, Mac) for
  the details of your regular expression library
• Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl
  (O’Reilly)
• Recipe 9.5, “Finding Files Irrespective of Case”


7.5 Doing a Search in a Pipeline 


Problem 


You need to search for some text, but the text you’re searching for isn’t in a 
file; instead, it’s in the output of a command or perhaps even the output of a 
pipeline of commands.


Solution


Just pipe your results into grep: 


some pipeline | of commands | grep 


Discussion 


When no filename is supplied to grep, it reads from standard input. Most 
well-designed utilities meant for shell scripting will do this. It is one of the 
things that makes them so useful as building blocks for shell scripts. 


If you also want to have grep search through error messages that come from 
the previous command, be sure to redirect its error output into standard 
output before the pipe: 


gcc bigbadcode.c 2>&1 | grep -i error 


This command attempts to compile some hypothetical hairy piece of code. 
We redirect standard error into standard output (2>&1) before we proceed to 
pipe (|) the output into grep, where it will search case-insensitively (-i)
looking for the string error. 
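

You can see the effect of 2>&1 without a compiler. In this sketch the function noisy is a made-up stand-in for gcc that writes one line to standard output and one to standard error.

```bash
# Hypothetical stand-in for a command that reports problems on stderr.
noisy() {
    echo "compiling..."
    echo "Error: something went wrong" >&2
}

# Only stdout enters the pipe (stderr is discarded here to keep the demo quiet).
without=$(noisy 2>/dev/null | grep -ci error || true)

# With 2>&1, the error message travels down the pipe too.
with=$(noisy 2>&1 | grep -ci error)

echo "matches without 2>&1: $without, with 2>&1: $with"
```

Without the redirection grep never sees the error line and counts zero matches; with it, grep counts one. The || true is there only because grep returns a nonzero status when it finds nothing.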


Don’t overlook the possibility of grepping the output of grep. Why would 
you want to do that? To further narrow down the results of a search. Let’s say 
you wanted to find out Bob Johnson’s email address: 


$ grep -i johnson mail/*
... too much output to think about; there are lots of Johnsons in the
world ...
$ !! | grep -i robert
grep -i johnson mail/* | grep -i robert
... more manageable output ...
$ !! | grep -i "the bluesman"
grep -i johnson mail/* | grep -i robert | grep -i "the bluesman"
Robert M. Johnson, The Bluesman <rmj@noplace.org>


You could have retyped the first grep, but this example also shows the power
of the !! history operator (see Recipe 18.2). The !! lets you repeat the
previous command without retyping it. You can then continue adding to the
command line after the !! as we show here. The shell will display the
command that it runs, so that you can see what you got as a result of the !!
substitution.


You can build up a long grep pipeline very quickly and simply this way, 
seeing the results of the intermediate steps as you go and deciding how to 
refine your search with additional grep expressions. You could also 
accomplish the same task with a single grep and a clever regular expression, 
but we find that building up a pipeline incrementally is easier. 


See Also


• man grep
• man regex (Linux, Solaris, HP-UX) or man re_format (BSD, Mac) for
  the details of your regular expression library
• Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl
  (O’Reilly)
• Recipe 2.15, “Connecting Two Programs by Using Output as Input”
• Recipe 9.5, “Finding Files Irrespective of Case”
• Recipe 18.2, “Repeating the Last Command”


7.6 Paring Down What the Search Finds 


Problem 


Your search is returning way more than you expected, including many results 
you don’t want. 


Solution 


Pipe the results into grep -v with an expression that describes what you
don’t want to see.


Let’s say you were searching for messages in a logfile, and you wanted all 
the messages from the month of December. You know that your logfile uses 
the three-letter abbreviation Dec for December, but you’re not sure if it’s 
always abbreviated, so to be sure to catch all the messages you type: 


grep -i dec logfile 


But then you get output like this: 


error on Jan 01: not a decimal number
error on Feb 13: base converted to Decimal
warning on Mar 22: using only decimal numbers
error on Dec 16 : the actual message you wanted
error on Jan 01: not a decimal number


A quick and dirty solution in this case is to pipe the first result into a second 
grep and tell the second grep to ignore any instances of “decimal”: 


grep -i dec logfile | grep -vi decimal 


It’s not uncommon to string a few of these together (as new, unexpected 
matches are also discovered) to filter down the search results to what you’re 
really looking for: 


grep -i dec logfile | grep -vi decimal | grep -vi decimate 


Discussion 


The “dirty” part of this “quick and dirty” solution is that the solution here
might also get rid of some of the December log messages, ones that you
wanted to keep: if they have the word “decimal” in them, they’ll be filtered
out by the grep -v.


The -v option can be handy if used carefully; you just have to keep in mind
what it might exclude.


For this particular example, a better solution would be to use a more powerful 
regular expression to match the December date, one that looked for “Dec” 
followed by a space and two digits: 


grep 'Dec [0-9][0-9]' logfile 


But that often won’t work either because syslog uses a space to pad
single-digit dates. To account for this, we can add a space in the first list:


grep 'Dec [0-9 ][0-9]' logfile 


We used single quotes around the expression because of the embedded 
spaces, and to avoid any possible shell interpretation of the bracket characters 
(not that there would be, but just as a matter of habit). It’s good to get into the 
habit of using single quotes around anything that might possibly be confusing 
to the shell. We could have written: 


grep Dec\ [0-9\ ][0-9] logfile 


escaping the spaces with a backslash, but in that form it’s harder to see where 
the search string ends and the filename begins. 
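
As a minimal sketch of the padded-date pattern at work, the following builds a small throwaway logfile and counts the lines the pattern selects. The log entries, the variable names logfile and count, and the use of mktemp are assumptions made for illustration, not part of the recipe:

```shell
# Hypothetical sample log; the entries are made up for illustration.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
Dec  5 10:01:02 host app: started
Dec 16 11:22:33 host app: not a decimal number
Jan 01 00:00:01 host app: Decimal conversion done
Dec5 malformed entry
EOF

# "Dec", a space, then either a digit or a padding space, then a digit.
grep 'Dec [0-9 ][0-9]' "$logfile"

# -c reports how many lines matched instead of printing them.
count=$(grep -c 'Dec [0-9 ][0-9]' "$logfile")
echo "matched lines: $count"
rm -f "$logfile"
```

Only the two real December entries are selected; the January line containing “Decimal” and the malformed “Dec5” line are not, and the December line that mentions “decimal” is kept.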


See Also 


■ man grep 


■ man regex (Linux, Solaris, HP-UX) or man re_format (BSD, Mac) for 
the details of your regular expression library 


■ Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl 
(O’Reilly) 


■ Recipe 9.5, “Finding Files Irrespective of Case” 


7.7 Searching with More Complex Patterns 


The regular expression mechanism of grep provides for some very powerful 
patterns that can fit most of your needs. 


A regular expression describes patterns for matching against strings. Any 
alphabetic character (or other character without special meaning to the shell) 
just matches that character in the string. “A” matches A, “B” matches B; no 
surprise there. The next important rule is to combine letters just by position, 
so AB matches “A” followed by “B”. This, too, seems obvious. But regular 
expressions define other special characters that can be used by themselves or 
in combination with other characters to make more complex patterns. 


The first special character is the period (.), which matches any single 
character. Therefore, .... matches any four characters; A. matches an “A” 
followed by any character; and .A. matches any character, then an “A”, then 
any character (not necessarily the same character as the first). 


An asterisk (*) matches zero or more occurrences of the previous character, 
so A* matches zero or more “A” characters, and .* matches zero or more 
characters of any sort (such as “abcdefg”, “aaaabc”, “sdfgf ;Ikjhy”, or even an 
empty line). 

So what does ..* mean? It matches any single character followed by zero or 
more of any character (i.e., one or more characters, but not an empty line). 


Speaking of lines, the caret ^ matches the beginning of a line of text and the 
dollar sign $ matches the end of a line; hence, ^$ matches an empty line (the 
beginning followed by the end, with nothing in between). 


What if you want to match an actual period, caret, dollar sign, or any other 
special character? Precede it by a backslash (\). ion. matches the letters 
“ion” followed by any other character, but ion\. matches “ion” followed by a 
period (e.g., at the end of a sentence or wherever else it appears with a 
trailing dot). 

A set of characters enclosed in square brackets (e.g., [abc]) matches any one 
of those characters (e.g., “a” or “b” or “c”). If the first character inside the 
square brackets is a caret, then it matches any character that is not in that set. 


For example, [AaEeIiOoUu] matches any of the vowels, and [^AaEeIiOoUu] 
matches any character that is not a vowel. This last case is not the same as 
saying that it matches consonants, because [^AaEeIiOoUu] also matches 
punctuation and other special characters that are neither vowels nor 
consonants. 


Another mechanism we want to introduce is a repetition mechanism called an 
“interval expression,” written as \{n,m\}, where n is the minimum number 
of repetitions and m is the maximum. If it is written as \{n\} it means 
“exactly n times,” and when written as \{n,\} it means “at least n times.” 


For example, the regular expression A\{5\} matches exactly five “A” 
characters in a row, whereas A\{5,\} matches five or more “A” characters. 
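
The rules above can be tried out with a short, hedged sketch. The sample lines and variable names below are made up for illustration; grep -c simply counts the lines each pattern matches:

```shell
# Made-up sample lines, one per case; the last one is an empty line.
sample=$(mktemp)
printf '%s\n' 'AAAA' 'AAAAA' 'AAAAAAA' 'nation.' 'ration' '' > "$sample"

five_or_more=$(grep -c 'A\{5,\}' "$sample")   # five or more A's in a row
unanchored=$(grep -c 'A\{5\}' "$sample")      # any line containing five A's in a row
exactly_five=$(grep -c '^A\{5\}$' "$sample")  # whole line is exactly five A's
literal_dot=$(grep -c 'ion\.' "$sample")      # "ion" followed by a real period
empty_lines=$(grep -c '^$' "$sample")         # empty lines only
rm -f "$sample"

echo "$five_or_more $unanchored $exactly_five $literal_dot $empty_lines"
```

Note that an unanchored A\{5\} also matches the line of seven A's, because grep looks for the pattern anywhere in the line; anchoring with ^ and $ is what restricts the match to a line of exactly five.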


See Also 


■ man grep 


■ Recipe 7.8, “Searching for an SSN” 


7.8 Searching for an SSN 


Problem 


You need a regular expression to match a Social Security number. 


Solution 


In the US these numbers are nine digits long, typically grouped as three 
digits, then two digits, then a final four digits (e.g., 123-45-6789). Sometimes 
they are written without hyphens, so you need to make hyphens optional in 
the regular expression: 


grep '[0-9]\{3\}-\{0,1\}[0-9]\{2\}-\{0,1\}[0-9]\{4\}' datafile 


You should be able to adapt this to other countries as needed, or consult one 
of the books we reference at the end of this recipe. 


Discussion 


These kinds of regular expressions are often jokingly referred to as write-only 
expressions, meaning that they can be difficult or impossible to read. We’ll 
take this one apart to help you understand it. In general, though, in any bash 
script that you write using regular expressions, be sure to put comments 
nearby explaining what you intend the regular expression to match. 


Adding some spaces to the regular expression would improve its readability, 
making visual comprehension easier, but it would also change the meaning— 
it would say that we’d need to match space characters at those points in the 
expression. Ignoring that for the moment, let’s insert some spaces into the 
previous regular expression so that we can read it more easily: 


[0-9]\{3\} -\{0,1\} [0-9]\{2\} -\{0,1\} [0-9]\{4\} 


The first grouping says “any digit” then “exactly 3 times.” The next grouping 
says “a dash” then “0 or 1 time.” The third grouping says “any digit” then 
“exactly 2 times.” The next grouping says “a dash” then “0 or 1 time.” The 
last grouping says “any digit” then “exactly 4 times.” 
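
A short sketch of the pattern in use, assuming a throwaway datafile with made-up entries (none of them real numbers). It also shows an equivalent written for grep -E, where the braces need no backslashes and ? means “zero or one”; that extended form is our own addition, not part of the recipe:

```shell
# Hypothetical test data: two well-formed numbers and two near misses.
datafile=$(mktemp)
printf '%s\n' '123-45-6789' '123456789' '12-345-6789' 'call 555-1234' > "$datafile"

# Basic regular expression form, as in the Solution.
bre_count=$(grep -c '[0-9]\{3\}-\{0,1\}[0-9]\{2\}-\{0,1\}[0-9]\{4\}' "$datafile")

# Extended regular expression form of the same pattern.
ere_count=$(grep -cE '[0-9]{3}-?[0-9]{2}-?[0-9]{4}' "$datafile")

rm -f "$datafile"
echo "$bre_count $ere_count"
```

Both forms select the same two lines here. Because the pattern is not anchored, it will also match inside a longer run of digits, so real data may need extra context around the expression.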


See Also 


■ man regex (Linux, Solaris, HP-UX) or man re_format (BSD, Mac) for 
the details of your regular expression library 


■ Classic Shell Scripting by Nelson H. F. Beebe and Arnold Robbins 
(O’Reilly), Section 3.2, for more about regular expressions and the tools 
that use them 


■ Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl 
(O’Reilly) 


■ Regular Expressions Cookbook, 2nd Edition, by Jan Goyvaerts and Steven 
Levithan (O’Reilly) 


■ Recipe 9.5, “Finding Files Irrespective of Case” 


7.9 Grepping Compressed Files 


Problem 


You need to grep some compressed files. Do you have to uncompress them 
first? 


Solution 
Not if you have zgrep, zcat, or gzcat on your system. 


zgrep is simply a grep that understands various compressed and 
uncompressed file types (which types are understood varies from system to 
system). You will commonly run into this when searching syslog messages 
on Linux, since the log rotation facilities leave the current logfile 
uncompressed (so it can be in use), but gzip archival logs: 


zgrep 'search term' /var/log/messages* 


zcat is simply a cat that understands various compressed and uncompressed 
files (which types are understood varies from system to system). It might 
understand more formats than zgrep, and it might be installed on more 
systems by default. It is also used in recovering damaged compressed files, 
since it will simply output everything it possibly can, instead of erroring out 
as gunzip or other tools might: 


zcat /var/log/messages.1.gz 


gzcat is similar to zcat, the differences having to do with commercial versus 
free Unix variants, and backward compatibility. 


Discussion 


The less utility may also be configured to transparently display various 
compressed files, which is very handy. See Recipe 8.15. 
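
Where none of the z-tools is installed, much the same effect can usually be had by decompressing to standard output and piping the result into grep. The following is a minimal sketch that assumes gzip is available; the directory, filename, and log contents are hypothetical:

```shell
# Build a small compressed "archival log" in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'Dec 16 error: disk full' 'Dec 17 info: ok' > "$workdir/messages.1"
gzip "$workdir/messages.1"     # replaces messages.1 with messages.1.gz

# gzip -dc writes the decompressed data to standard output and
# leaves the compressed file on disk untouched.
found=$(gzip -dc "$workdir/messages.1.gz" | grep 'error')
printf '%s\n' "$found"

rm -rf "$workdir"
```

Nothing is written back to disk in uncompressed form, which is the point of the recipe.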


See Also 


■ Recipe 8.6, “Compressing Files” 
■ Recipe 8.7, “Uncompressing Files” 


■ Recipe 8.15, “Doing More with less” 


7.10 Keeping Some Output, Discarding the Rest 


Problem 


You need a way to keep some of your output and discard the rest. 


Solution 


The following code prints the first word of every line of input: 


awk '{print $1}' myinput.file 


Words are delineated by whitespace. The awk utility reads data from the 
filename supplied on the command line, or from standard input if no filename 
is given. Therefore, you can redirect the input from a file, like this: 


awk '{print $1}' < myinput.file 


or even from a pipe, like this: 


cat myinput.file | awk '{print $1}' 


Discussion 


The awk program can be used in several different ways. Its easiest, simplest 
use is just to print one or more selected fields from its input. 


Fields are delineated by whitespace (or by the separator specified with the -F 
option) and are numbered starting at 1. The field $0 represents the entire line of input. 


awk is a complete programming language; awk scripts can become extremely 
complex. This is only the beginning. 
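
As a small sketch of the -F option, the following splits colon-separated records instead of whitespace-separated ones. The two records are made up in the style of /etc/passwd, and the variable name is ours:

```shell
# With -F: the colon becomes the field separator, so $1 is the
# account name and $7 is the login shell in these sample records.
name_and_shell=$(printf '%s\n' \
        'root:x:0:0:root:/root:/bin/bash' \
        'daemon:x:1:1:daemon:/usr/sbin:/bin/sh' |
    awk -F: '{print $1, $7}')
printf '%s\n' "$name_and_shell"
```

The comma in the print statement makes awk separate the two fields with a single space on output.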


See Also 

■ Recipe 8.4, “Cutting Out Parts of Your Output” 

■ Recipe 13.13, “Isolating Specific Fields in Data” 

■ man awk 

■ http://www.faqs.org/faqs/computer-lang/awk/faq/ 

■ Effective awk Programming, 4th Edition, by Arnold Robbins (O’Reilly) 


■ sed & awk, 2nd Edition, by Arnold Robbins and Dale Dougherty 
(O’Reilly) 


7.11 Keeping Only a Portion of a Line of 
Output 


Problem 


You want to keep only a portion of a line of output, such as just the first and 
last words. For example, you would like ls to list just filenames and 
permissions, without all of the other information provided by ls -l. 
However, you can’t find any options to ls that would limit the output in that 
way. 


Solution 


Pipe ls into awk, and just pull out the fields that you need: 


$ ls -l | awk '{print $1, $NF}' 
total 151130 
-rw-r--r-- add.1 
drwxr-xr-x art 
drwxr-xr-x bin 
-rw-r--r-- BuddyIcon.png 
drwxr-xr-x CDs 
drwxr-xr-x downloads 
drwxr-sr-x eclipse 
$ 


Discussion 


Consider the output from the ls -l command. One line of it looks like this: 
drwxr-xr-x 2 username group 176 2006-10-28 20:09 bin 


so it is convenient for awk to parse (by default, whitespace delineates fields 
in awk). The output from ls -l has the permissions as the first field and the 
filename as the last field. 


We use a bit of a trick to print the filename. Since the various fields are 
referenced in awk using a dollar sign followed by the field number (e.g., $1, 
$2, $3), and since awk has a built-in variable called NF that holds the number 
of fields found on the current line, $NF always refers to the last field. (For 
example, the ls output line has eight fields, so the variable NF contains 8, so 
$NF refers to the eighth field of the input line, which in our example is the 
filename.) 


Just remember that you don’t use a $ to read the value of an awk variable 
(unlike bash variables). NF is a valid variable reference by itself. Adding a $ 
before it changes its meaning from “the number of fields on the current line” 
to “the last field on the current line.” 
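
The difference between NF and $NF can be seen directly with the sample ls -l line from above. This is only a sketch; the variable names line, count, and last are ours:

```shell
line='drwxr-xr-x 2 username group 176 2006-10-28 20:09 bin'

count=$(printf '%s\n' "$line" | awk '{print NF}')   # how many fields there are
last=$(printf '%s\n' "$line" | awk '{print $NF}')   # the contents of the last field

echo "$count $last"
```

One caution: a filename that contains spaces is split into several fields, so $NF would then hold only the last word of the name.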


See Also 


■ man awk 
■ http://www.faqs.org/faqs/computer-lang/awk/faq/ 
■ Effective awk Programming, 4th Edition, by Arnold Robbins (O’Reilly) 


■ sed & awk, 2nd Edition, by Arnold Robbins and Dale Dougherty 
(O’Reilly) 


7.12 Reversing the Words on Each Line 


Problem 


You want to print the input lines with words in the reverse order. 


Solution 


$ awk '{ 
> for (i=NF; i>0; i--) { 
> printf "%s ", $i; 
> } 
> printf "\n" 
> }' <filename> 


You don’t type the > characters; the shell will print those as a prompt to say 
that you haven’t ended your command yet (it is looking for the matching 
single-quote mark). Because the awk program is enclosed in single quotes, 
the bash shell lets us type multiple lines, prompting us with the secondary 
prompt > until we supply the matching end quote. We spaced out the program 
for readability, even though we could have stuffed it all onto one line like 
this: 


$ awk '{for (i=NF; i>0; i--) {printf "%s ", $i;} printf "\n" }' <filename> 


Discussion 


The awk language has syntax for a for loop, very much like C. It even 
supports a printf mechanism for formatted output, again modeled after the C 
version (and the bash version, too). We use the for loop to count down from the 
last to the first field, and print each field as we go. We deliberately don’t put 
a \n on that first printf because we want to keep the several fields on the 
same line of output. When the loop is done, we add a newline to terminate the 
line of output. 


The reference to $i is very different in awk compared to bash. In bash, when 
we write $i we are getting at the value stored in the variable named i. But in 
awk, as with most programming languages, we simply reference the value in 
i by naming it; that is, by just writing i. So what is meant by $i in awk? 
The value of the variable i is resolved to a number, and then the dollar-number 
expression is understood as a reference to a field (or word) of input: 
that is, the ith field. So as i counts down from the last field to the first, this 
loop will print the fields in that reversed order. 
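
The Solution leaves a trailing space at the end of each output line. If that matters, one possible variation (our own sketch, not the recipe’s code) chooses the separator per field, printing a space after every field except the last one written:

```shell
# Two made-up input lines; the reversed text is captured in a variable.
reversed=$(printf '%s\n' 'one two three' 'a b' |
    awk '{
        for (i = NF; i > 0; i--) {
            # a space between fields, a newline after the final one
            printf "%s%s", $i, (i > 1 ? " " : "\n")
        }
    }')
printf '%s\n' "$reversed"
```

Unlike the Solution, this variation prints nothing at all for an empty input line, since the loop body never runs when NF is 0.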


See Also 

■ man printf(1) 

■ man awk 

■ http://www.faqs.org/faqs/computer-lang/awk/faq/ 

■ Effective awk Programming by Arnold Robbins (O’Reilly) 

■ sed & awk by Arnold Robbins and Dale Dougherty (O’Reilly) 

■ “printf” in Appendix A 


7.13 Summing a List of Numbers 


Problem 


You need to sum a list of numbers, including numbers that don’t appear on 
lines by themselves. 


Solution 


Use awk both to isolate the field to be summed and to do the summing. Here 
we’ll sum up the numbers that are the file sizes from the output of an ls -l 
command: 


ls -l | awk '{sum += $5}; END {print sum}' 


Discussion 


We are summing up the fifth field of the ls -l output. The output of ls -l 
looks like this: 


-rw-r--r-- 1 albing users 267 2005-09-26 21:26 lilmax 


The fields are: permissions, links, owner, group, size (in bytes), last 
modification date, time of modification, and filename. We’re only interested 
in the size, so we use $5 in our awk program to reference that field. 


We enclose the two bodies of our awk program in braces ({}); note that there 
can be more than one body (or block) of code in an awk program. A block of 
code preceded by the literal keyword END is only run once, when the rest of 
the program has finished. Similarly, you can prefix a block of code with 
BEGIN and supply some code that will be run before any input is read. The 
BEGIN block is useful for initializing variables, and we could have used one 
here to initialize sum, but awk guarantees that variables will start out empty. 


If you look at the output of an ls -l command, you will notice that the first 
line is a total, and doesn’t fit our expected format for the other lines. 

We have two choices for dealing with that. First, we can pretend it’s not 
there, which is the approach taken in the preceding solution. Since that 
undesired line doesn’t have a fifth field, our reference to $5 will be empty, 
and our sum won’t change. 


The more conscientious approach would be to eliminate that line. We could 
do so before we give the output to awk by using grep: 


ls -l | grep -v '^total' | awk '{sum += $5}; END {print sum}' 


or we could do a similar thing within awk: 


ls -l | awk '/^total/{next} {sum += $5}; END {print sum}' 


The ^total is a regular expression (regex); it means “the letters t-o-t-a-l 
occurring at the beginning of a line” (the leading ^ anchors the search to the 
beginning of a line). For any line of input matching that regex, the associated 
block of code will be executed. The second block of code (the sum) has no 
leading text, the absence of which tells awk to execute it for every line of 
input (meaning this will happen regardless of whether the line matches the 
regex). 

Now, the whole point of adding the special case for “total” was to exclude 
such a line from our summing. Therefore, in the ^total block we add a next 
command, which ends processing on this line of input and starts over with the 
next line of input. Since that next line of input will not begin with “total”, 
awk will execute the second block of code with this new line of input. We 
could also have used a getline in place of the next command. getline 
does not rematch all the patterns from the top, only the ones from there on 
down. Note that in awk programming, the order of the blocks of code matters. 
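
To see the blocks working together, here is a hedged sketch that feeds awk a made-up listing in the ls -l layout. It also adds a BEGIN block that sets sum to 0, which is optional, but it means empty input prints 0 rather than an empty line. The listing and variable names are assumptions for illustration:

```shell
# Made-up listing in the same layout as ls -l output.
listing='total 8
-rw-r--r-- 1 albing users 267 2005-09-26 21:26 lilmax
-rw-r--r-- 1 albing users 733 2005-09-26 21:30 other'

prog='BEGIN {sum = 0} /^total/ {next} {sum += $5} END {print sum}'

total=$(printf '%s\n' "$listing" | awk "$prog")   # the "total" line is skipped by next
empty_total=$(printf '' | awk "$prog")            # no input at all: BEGIN and END still run

echo "$total $empty_total"
```

The /^total/ block has to come before the summing block for next to have any effect, which is one place where the order of the blocks matters.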


See Also 


■ man awk 
■ http://www.faqs.org/faqs/computer-lang/awk/faq/ 
■ Effective awk Programming, 4th Edition, by Arnold Robbins (O’Reilly) 


■ sed & awk, 2nd Edition, by Arnold Robbins and Dale Dougherty 
(O’Reilly) 


7.14 Counting String Values with awk 


Problem 


You need to count all the occurrences of several different strings, including 
some strings whose values you don’t know beforehand. That is, you’re not 
trying to count the occurrences of a predetermined set of strings. Rather, you 
are going to encounter some strings in your data and you want to count these 
as-yet-unknown strings. 


Solution 


Use awk’s associative arrays (also known as hashes or dictionaries in other 
languages) for your counting. 


For our example, we’ll count how many files are owned by various users on 
our system. The username shows up as the third field in ls -l output, so 
we'll use that field ($3) as the index of the array and increment that member 
of the array (see Example 7-1). 


Example 7-1. ch07/asar.awk 


#!/usr/bin/awk -f 
# cookbook filename: asar.awk 
# Associative arrays in Awk 
# Usage: ls -lR /usr/local | asar.awk 

NF > 7 { 
    user[$3]++ 
} 
END { 
    for (i in user) { 
        printf "%s owns %d files\n", i, user[i] 
    } 
} 


We invoke awk a bit differently here. Because this awk script is a bit more 
complex, we’ve put it in a separate file. We use the -f option to tell awk 
where to get the script file (because the script begins with a 
#!/usr/bin/awk -f shebang line, we could also have made it executable and 
run it directly): 


$ ls -lR /usr/local | awk -f asar.awk 
bin owns 68 files 
albing owns 1801 files 
root owns 13755 files 
man owns 11491 files 
$ 


Discussion 


We use the condition NF > 7 as a qualifier to part of the awk script to weed 
out the lines that do not contain filenames, which appear in the ls -lR output 
and are useful for readability—they include blank lines to separate different 
directories as well as total counts for each subdirectory. Such lines don’t have 
as many fields (or words). The expression NF > 7 that precedes the opening 
brace is not enclosed in slashes, which is to say that it is not a regular 
expression. It’s a logical expression, much like you would use in an if 
statement, and it evaluates to true or false. The NF variable is a special built-in 
variable that refers to the number of fields for the current line of input. So, 
only if a line of input has more than seven fields (words of text) will it be 
processed by the statements within the braces. 


The key line, however, is this one: 
user[$3]++ 


Here, the username (e.g., bin) is used as the index to the array. It’s called an 
associative array because a hash table (or similar mechanism) is being used 
to associate each unique string with a numerical value. awk is doing all that 
work for you behind the scenes; you don’t have to write any string 
comparisons or lookups and such. 


Once you’ve built such an array, it might seem difficult to get the values back 
out. For this, awk has a special form of the for loop. Instead of the numeric 
for(i=0; i<max; i++) that awk also supports, there is a particular syntax 
for associative arrays: 


for (i in user) 


In this expression, the variable i will take on successive values (in no 
particular order) from the various values used as indexes to the array user. In 
our example, this means that i will take on the values bin, albing, root, 
and man, one in each iteration of the loop. If you haven’t seen associative 
arrays before, then we hope that you’re surprised and impressed. This is a 
very powerful feature of awk (and Perl). 
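
Because the for (i in user) loop visits the indexes in no particular order, a simple way to get repeatable output is to pipe the result through sort. The following sketch assumes a made-up listing in the ls -l layout; the filenames and sizes are hypothetical:

```shell
listing='total 12
-rw-r--r-- 1 root   root  10 2006-10-28 20:09 a
-rw-r--r-- 1 albing users 20 2006-10-28 20:09 b
-rw-r--r-- 1 root   root  30 2006-10-28 20:09 c'

counts=$(printf '%s\n' "$listing" |
    awk 'NF > 7 { user[$3]++ }      # count one file per owner name
         END { for (i in user) printf "%s owns %d files\n", i, user[i] }' |
    sort)                           # sort makes the order predictable
printf '%s\n' "$counts"
```

The “total” line has only two fields, so the NF > 7 condition keeps it out of the counts.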


See Also 


■ man awk 
■ http://www.faqs.org/faqs/computer-lang/awk/faq/ 
■ Effective awk Programming, 4th Edition, by Arnold Robbins (O’Reilly) 


■ sed & awk, 2nd Edition, by Arnold Robbins and Dale Dougherty 
(O’Reilly) 


■ Recipe 7.15, “Counting String Values with bash” 


7.15 Counting String Values with bash 


Problem 


You need to count all the occurrences of several different strings, including 
some strings whose values you don’t know beforehand. That is, you’re not 
trying to count the occurrences of a predetermined set of strings. Rather, you 
are going to encounter some strings in your data and you want to count these 
as-yet-unknown strings. 


Solution 


If you are using version 4.0 or newer, use bash’s associative arrays (also 
known as hashes or dictionaries in other languages) for your counting. 


For our example, we’ll count how many files are owned by various users on 
our system. The username shows up as the third field in ls -l output, so 
we'll use that value ($3) as the index of the array, and increment that member 
of the array: 


Example 7-2. ch07/cnt_owner.sh 


# cookbook filename: cnt_owner 
# count owners of a file using bash 
# pipe "ls -l" into this script 

declare -A AACOUNT 
while read -a LSL 
do 
    # only consider lines that are 7 words or longer 
    if (( ${#LSL[*]} >= 7 ))        # the size of the array 
    then 
        NDX=${LSL[2]}               # string assign; index 2 is the third word 
        (( AACOUNT[${NDX}] += 1 ))  # math increment 
    fi 
done 
for VALS in "${!AACOUNT[@]}"        # index of each element 
do 
    echo $VALS "owns" ${AACOUNT[$VALS]} "files" 
done 


We can invoke the program as follows with the results as shown: 


$ ls -lR /usr/local | bash cnt_owner.sh 
bin owns 68 files 
root owns 13755 files 
man owns 11491 files 
albing owns 1801 files 
$ 


Discussion 


The read -a LSL reads a line at a time, and each word (delineated by 
whitespace) is assigned to an entry in the array LSL. We check to see how 
many words were read by checking the size of the array to weed out the lines 
that do not contain filenames. Such lines are part of the ls -lR output and 
are usually useful for readability because they include blank lines to separate 
different directories as well as total counts for each subdirectory. They don’t 
have useful information for our script, but fortunately such lines don’t have 
as many fields (or words) as the lines we want. 


Only for lines with at least seven words do we take the third word, which 
should be the owner of the file, and use that as an index to our associative 
array. With standard arrays, such as LSL, each element is referred to by its 
index and that index is an integer. With an associative array, however, the 
index can be a string. 


To print out the results we need to loop over the list of index values that were 
used with this array. The construct "${AACOUNT[@]}" would generate a list 
of all the values in the array, but add the “bang” to get "${!AACOUNT[@]}" and 
you get a list of all the index values used with this array. 


Note that the output is in no particular order (it’s related to the internals of the 
hashing algorithm). If you want it sorted by name or by number of files, then 
pipe this result into the sort command. 
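
Both orderings can be sketched with the result lines shown above held in a variable. Sorting by name is a plain sort; sorting by count uses a numeric, reversed sort on the third word of each line. The variable names are ours:

```shell
results='bin owns 68 files
root owns 13755 files
man owns 11491 files
albing owns 1801 files'

by_name=$(printf '%s\n' "$results" | sort)              # alphabetical by owner
by_count=$(printf '%s\n' "$results" | sort -k3,3nr)     # largest count first

printf '%s\n' "$by_name"
printf '%s\n' "$by_count"
```

In practice the same sort commands would simply be added to the end of the pipeline that runs the script.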


See Also 


■ Recipe 7.14, “Counting String Values with awk” 
■ Recipe 7.16, “Showing Data as a Quick and Easy Histogram” 


7.16 Showing Data as a Quick and Easy 
Histogram 


Problem 


You need a quick screen-based histogram of some data. 


Solution 


Use the associative arrays of awk, as discussed in Recipe 7.14 (see 
Example 7-3). 


Example 7-3. ch07/hist.awk 


#!/usr/bin/awk -f 
# cookbook filename: hist.awk 
# Histograms in Awk 
# Usage: ls -lR /usr/local | hist.awk 

# return the largest value in arr; k and big are local variables 
function max(arr,    k, big) 
{ 
    big = 0; 
    for (k in arr) 
    { 
        if (arr[k] > big) { big=arr[k];} 
    } 
    return big 
} 
NF > 7 { 
    user[$3]++ 
} 
END { 
    # for scaling 
    maxm = max(user); 
    for (i in user) { 
        #printf "%s owns %d files\n", i, user[i] 
        scaled = 60 * user[i] / maxm ; 
        printf "%-10.10s [%8d]:", i, user[i] 
        # use j here so the outer loop variable i is left alone 
        for (j=0; j<scaled; j++) { 
            printf "#"; 
        } 
        printf "\n"; 
    } 
} 


When we run it with the same input as Recipe 7.14, we get: 


$ ls -lR /usr/local | awk -f hist.awk 
bin        [      68]:# 
albing     [    1801]:######## 
root       [   13755]:############################################################ 
man        [   11491]:################################################### 
$ 


Discussion 


We could have put the code for max as the first code inside the END block, but 
we wanted to show you that you can define functions in awk. We are using a 
fancier printf statement here. The string format %-10.10s will left-justify 
and pad to 10 characters but also truncate at 10 characters. The integer format 
%8d will assure that the integer is printed in an 8-character field. This gives 
each histogram the same starting point, by using the same amount of space 
regardless of the username or the size of the integer. 


Like all arithmetic in awk, the scaling calculation is done with floating-point 
numbers unless we explicitly truncate the result with a call to the built-in 
int() function. We don’t do so, which means that the for loop will execute 
at least once, so that even the smallest amount of data will still display a 
single hash mark. 


The data returned from the for (i in user) loop is in no particular order, 
probably based on some convenient ordering of the underlying hash table. If 
you wanted the histogram displayed in a sorted order, either numeric by 
count or alphabetical by username, you would have to add some sorting. One 
way to do this is to break this program apart into two pieces, sending the 
output from the first part into the sort command and then piping that output 
into the second piece to print the histogram. 
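
One way that two-piece arrangement might look is sketched below; it is an illustration under stated assumptions, not the book’s code. The first awk only counts, sort orders the counts from largest to smallest, and the second awk draws the bars, taking the maximum from the first line it reads. The sample listing is made up:

```shell
listing='total 16
-rw-r--r-- 1 root   root  10 2006-10-28 20:09 a
-rw-r--r-- 1 root   root  20 2006-10-28 20:09 b
-rw-r--r-- 1 root   root  30 2006-10-28 20:09 c
-rw-r--r-- 1 albing users 40 2006-10-28 20:09 d'

histogram=$(printf '%s\n' "$listing" |
    awk 'NF > 7 { user[$3]++ }
         END { for (u in user) print u, user[u] }' |    # piece 1: name and count
    sort -k2,2nr |                                      # largest count first
    awk 'NR == 1 { maxm = $2 }      # sorted input: line 1 holds the maximum
         {
             scaled = 60 * $2 / maxm
             printf "%-10.10s [%8d]:", $1, $2
             for (j = 0; j < scaled; j++) printf "#"
             printf "\n"
         }')                                            # piece 2: draw the bars
printf '%s\n' "$histogram"
```

Sorting with sort -k1,1 instead would give alphabetical order, but then the second piece could no longer assume that the first line carries the largest count.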


See Also 


■ man awk 
■ http://www.faqs.org/faqs/computer-lang/awk/faq/ 
■ Effective awk Programming, 4th Edition, by Arnold Robbins (O’Reilly) 


■ sed & awk, 2nd Edition, by Arnold Robbins and Dale Dougherty 
(O’Reilly) 


■ Recipe 7.14, “Counting String Values with awk” 
■ Recipe 7.17, “An Easy Histogram with bash” 
■ Recipe 8.1, “Sorting Your Output” 


7.17 An Easy Histogram with bash 


Problem 


You’d like to use bash rather than an external program to compute and draw 
your histogram. Is that possible? 


Solution 


Yes, thanks to associative arrays. They are available in versions of bash from 
4.0 onward. Based on the code for counting strings (Recipe 7.15), the 
difference is only in the output section. First we make a pass over the values 
to find the largest value, so that we can scale our output to fit on the page: 


BIG=0 
for VALS in "${!UCOUNT[@]}" 
do 
    if (( UCOUNT[$VALS] > BIG )) ; then BIG=${UCOUNT[$VALS]} ; fi 
done 


With a maximum value (in BIG), we output a line for each entry in the array: 


# 
# print the histogram 
# 
for VALS in "${!UCOUNT[@]}" 
do 
    printf "%-9.9s [%7d]:" $VALS ${UCOUNT[$VALS]} 
    # scale to the max value (BIG); N.B. integer / 
    SCALED=$(( ( (59 * UCOUNT[$VALS]) / BIG) +1 )) 
    for ((i=0; i<SCALED; i++)) { 
        printf "#" 
    } 
    printf "\n" 
done 



Discussion 


As in Recipe 7.15, the construct "${!UCOUNT[@]}" is crucial. It evaluates to a 
list of index values used on the array (in this case, the array UCOUNT). The for 
loop takes each value one at a time and uses it as the index into the array to 
get the count for that user. 


We scale it to 59 spaces and then add 1 so that any nonzero value will have at 
least one mark on the histogram. This isn’t a problem in the awk version 
(Recipe 7.16) because awk uses floating-point math, but the bash version 
uses integer math so anything too small may end up as 0 after the division. 
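
The effect of the integer division is easy to check by hand. This sketch uses the counts from the earlier sample output (68, 1801, and 13755, with 13755 as the maximum) and prints, for each one, the result of the division alone and the result with 1 added. The variable names COUNT, PLAIN, and table are ours:

```shell
BIG=13755
table=$(for COUNT in 68 1801 13755
do
    PLAIN=$(( (59 * COUNT) / BIG ))            # integer division alone
    SCALED=$(( ( (59 * COUNT) / BIG) + 1 ))    # the recipe's version
    echo "$COUNT $PLAIN $SCALED"
done)
printf '%s\n' "$table"
```

Without the added 1, the smallest entry scales to 0 and would show no bar at all; with it, the bars run from 1 to 60 marks.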


See Also 


■ Recipe 7.15, “Counting String Values with bash” 
■ Recipe 7.16, “Showing Data as a Quick and Easy Histogram” 


7.18 Showing a Paragraph of Text After a Found 
Phrase 


Problem 


You are searching for a phrase in a document, and want to show the 
paragraph after the found phrase. 


Solution 


We’re assuming a simple text file, where paragraph means all the text 
between blank lines, so the occurrence of a blank line implies a paragraph 
break. Given that, it’s a pretty short awk program: 


$ cat para.awk 
/keyphrase/ { flag=1 } 
flag == 1 { print } 
/^$/ { flag=0 } 


$ awk -f para.awk < searchthis.txt 


Discussion 


There are just three simple code blocks. The first is invoked when a line of 
input matches the regular expression (here just the word “keyphrase”). If 
“keyphrase” occurs anywhere within a line of input, that is a match and this 
block of code will be executed. All that happens in this block is that the flag 
is set. 


The second code block is invoked for every line of input, since there is no 
regular expression preceding its open brace. Even the input that matches 
“keyphrase” will also be applied to this code block (if we didn’t want that 
effect, we could use a next statement in the first block). All this second block 
does is print the entire input line, but only if the flag is set. 


The third block has a regular expression that, if satisfied, will simply reset 
(turn off) the flag. That regular expression uses two characters with special 
meaning: the caret (^), when used as the first character of a regular 
expression, matches the beginning of the line; the dollar sign ($), when used 
as the last character, matches the end of the line. So, the regular expression 
^$ matches an empty line, with no characters between the beginning and end 
of the line. 


We could have used a slightly more complicated regular expression for an 
empty line to let it handle any line with just whitespace rather than a 
completely blank line. That would make the third line look like this: 


/^[[:blank:]]*$/ { flag=0 } 


Perl programmers love the sort of problem and solution discussed in this 
recipe, but we’ve implemented it with awk because Perl is (mostly) beyond 
the scope of this book. If you know Perl, by all means use it. If not, awk 
might be all you need. 
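
Here is a short sketch of the three blocks run against made-up text, using the whitespace-tolerant form of the third pattern. One of the separator lines deliberately contains only spaces; the text and variable names are assumptions for illustration:

```shell
# Six made-up lines: the second is empty, the fifth holds only spaces.
shown=$(printf '%s\n' \
        'intro line' \
        '' \
        'the keyphrase is here' \
        'rest of the paragraph' \
        '   ' \
        'next paragraph' |
    awk '/keyphrase/      { flag=1 }
         flag == 1        { print }
         /^[[:blank:]]*$/ { flag=0 }')
printf '%s\n' "$shown"
```

Printing starts at the line containing the phrase and continues through the separator line that ends that paragraph; the separator itself is printed because the print block runs before the flag is reset.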


See Also 


■ man awk 
■ http://www.faqs.org/faqs/computer-lang/awk/faq/ 
■ Effective awk Programming, 4th Edition, by Arnold Robbins (O’Reilly) 


■ sed & awk, 2nd Edition, by Arnold Robbins and Dale Dougherty 
(O’Reilly) 


Chapter 8. Intermediate Shell Tools II 


This chapter introduces some more useful utilities that are not part of the 
shell but are used in so many shell scripts that you really should know about 
them. 


Sorting is such a common task, and so useful for readability reasons, that it’s 
good to know about the sort command. In a similar vein, the tr command will 
translate or map from one character to another, or even just delete characters. 


One common thread here is that these utilities are written not just as 
standalone commands, but also as filters that can be included in a pipeline of 
commands. These sorts of commands will typically take one to many 
filenames as parameters (or arguments), but in the absence of any filenames 
they will read from standard input. They also write to standard output. That 
combination makes it easy to connect to the commands with pipes, as in 
something | sort | even more. 


This makes them especially useful, and avoids the clutter and confusion of a 
myriad of temporary files. 


8.1 Sorting Your Output 


Problem 


You would like output in a sorted order, but you don’t want to write (yet 
again) a custom sort function for your program or shell script. Hasn’t this 
been done already? 


Solution 


Use the sort utility. You can sort one or more files by putting the filenames 
on the command line: 


sort file1.txt file2.txt myotherfile.xyz 


With no filenames on the command line, sort will read from standard input, 
so you can pipe the output from a previous command into sort: 


somecommands | sort 


Discussion 


It can be handy to have your output in sorted order, and handier still not to 
have to add sorting code to every program you write. The shell’s piping 
allows you to hook up sort to any program’s standard output. 


There are many options to sort, but two of the three most worth remembering 
are: 


sort -r 


to reverse the order of the sort (where, to borrow a phrase, the last shall be 
first and the first, last) and: 


sort -f 


to “fold” lower- and uppercase characters together; i.e., to ignore the case
differences. This can be done either with the -f option or with a GNU
long-format option:


sort --ignore-case 
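
To see the two options side by side, here is a small sketch on made-up
input. We set LC_ALL=C only so that the results are repeatable; that setting
is our assumption for the demonstration, not something the options require.

```shell
# Hypothetical input; LC_ALL=C pins the collating order for repeatability
words=$(printf '%s\n' banana Apple Cherry)

plain=$(echo "$words" | LC_ALL=C sort)        # uppercase sorts before lowercase
reversed=$(echo "$words" | LC_ALL=C sort -r)  # the same ordering, flipped
folded=$(echo "$words" | LC_ALL=C sort -f)    # case differences ignored

printf 'plain:\n%s\n\nreversed:\n%s\n\nfolded:\n%s\n' \
    "$plain" "$reversed" "$folded"
```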


We decided to keep you in suspense, so see Recipe 8.2 for the third-coolest
sort option.


See Also


• man sort
• Recipe 8.2, “Sorting Numbers”


8.2 Sorting Numbers 


Problem 


When sorting numeric data, you notice that the order doesn’t seem right: 


$ sort somedata
2
200
21
250
$


Solution 


You need to tell sort that the data should be sorted as numbers. Specify a 
numeric sort with the -n option: 


$ sort -n somedata
2
21
200
250
$


Discussion 


There is nothing wrong with the original (if odd) sort order if you realize that
it is an alphabetic sort on the data (i.e., 21 comes after 200 because 1 comes
after 0 in an alphabetic sort). Of course, what you probably want is numeric
ordering, so you need to use the -n option.


sort -rn can be very handy in giving you a descending frequency list of
something when combined with uniq -c. For example, let’s display the most
popular shells on this system:


$ cut -d':' -f7 /etc/passwd | sort | uniq -c | sort -rn 
20 /bin/sh 
10 /bin/false 
2 /bin/bash 
1 /bin/sync 


cut -d':' -f7 /etc/passwd isolates the shell from the /etc/passwd file.
Then we have to do an initial sort so that uniq will work. uniq -c counts
consecutive, duplicate lines, which is why we need the presort. Then sort -rn
gives us a reverse numerical sort, with the most popular shell at the top.


If you don’t need to count the occurrences and just want a unique list of
values (i.e., if you want sort to remove duplicates), then you can use the -u
option on the sort command (and omit the uniq command). So, to find just
the list of different shells on this system:


cut -d':' -f7 /etc/passwd | sort -u 
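
If you’d like to experiment without touching /etc/passwd, the same pipelines
can be tried on a small inline list. The list of shells below is invented for
the sketch, and the awk step is only there to tidy the padding that uniq -c
puts in front of each count.

```shell
# Hypothetical list of login shells, deliberately unsorted
shells=$(printf '%s\n' /bin/sh /bin/bash /bin/sh /bin/false /bin/sh /bin/bash)

# Without the presort, uniq -c only merges adjacent duplicates,
# so nothing is combined here
unsorted_groups=$(echo "$shells" | uniq -c | wc -l)

# With the presort, each shell is counted once; awk tidies the padding
frequency=$(echo "$shells" | sort | uniq -c | sort -rn | awk '{print $1, $2}')

# sort -u gives the unique values with no counts at all
unique=$(echo "$shells" | sort -u)

echo "$frequency"
```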


See Also 


• man sort
• man uniq
• man cut


8.3 Sorting IP Addresses 


Problem 


You want to sort a list of numeric IP addresses, but you’d like to sort by the
last portion of the number or by the entire address logically.


Solution 


To sort by the last octet only (old syntax):


$ sort -t. -n +3.0 ipaddr.list
10.0.0.2
192.168.0.2
192.168.0.4
10.0.0.5
192.168.0.12
10.0.0.20
$


To sort the entire address as you would expect (POSIX syntax): 


$ sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n ipaddr.list
10.0.0.2
10.0.0.5
10.0.0.20
192.168.0.2
192.168.0.4
192.168.0.12
$


Discussion 


We know this is numeric data, so we use the -n option. The -t option 
indicates the character to use as a separator between fields (in our case, a 
period) so that we can also specify which fields to sort first. In the first 
example, we start sorting with the third field (zero-based) from the left, and
the very first character (again, zero-based) of that field, so +3.0.


In the second example, we used the new POSIX specification instead of the 
traditional (but obsolete) +pos1 -pos2 method. Unlike the older method, it is 
not zero-based, so fields start at 1:


sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n ipaddr.list


Wow, that’s ugly. Here it is in the old format, which is just as bad:


sort -t. +0n -1 +1n -2 +2n -3 +3n -4


Using -t. to define the field delimiter is the same, but the sort-key fields are
given quite differently. In this case, -k 1,1n means “start sorting at the
beginning of field one (1) and (,) stop sorting at the end of field one (1) and
do a numerical sort (n).” Once you get that, the rest is easy. When using more
than one field, it’s very important to tell sort where to stop. The default is to 
go to the end of the line, which is often not what you want and which will 
really confuse you if you don’t understand what it’s doing. 
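
The bounded keys described above can be tried on a small inline list. This
is only a sketch; the addresses are the ones used in this recipe, fed in
through a variable rather than from ipaddr.list.

```shell
# The recipe's addresses, deliberately out of order
addrs=$(printf '%s\n' 192.168.0.12 10.0.0.20 10.0.0.5 \
                      192.168.0.4 10.0.0.2 192.168.0.2)

# Each -k N,Nn key starts and stops on field N and compares it numerically
by_address=$(echo "$addrs" | sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n)

# A single bounded key sorts on the last octet alone
by_last_octet=$(echo "$addrs" | sort -t . -k 4,4n)

echo "$by_address"
```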


TIP 


The order that sort uses is affected by your locale setting. If your results are not 
as expected, that’s one thing to check. 


Your sort order will vary from system to system depending on whether your 
sort command defaults to using a stable sort. A stable sort preserves the 
original order in the sorted data when the sort fields are equal. Linux and 
Solaris do not default to a stable sort, but NetBSD does. And while -S turns 
off the stable sort on NetBSD, it sets the buffer size in other versions of sort. 


Say we have a trivial file like: 


10.0.0.5 # mainframe 
192.168.0.12 # speedy 
10.0.0.20 # lanyard 
192.168.0.4 # office 
10.0.0.2 # sluggish 
192.168.0.2 # laptop 


If we run this sort command on a Linux or Solaris system:


sort -t. -k4n ipaddr.list


or this command on a NetBSD system:


sort -t. -S -k4n ipaddr.list


we will get the data sorted as shown in the first column of Table 8-1. Remove
the -S on a NetBSD system, and sort will produce the ordering as shown in
the second column.


Table 8-1. Sort ordering comparison of Linux, Solaris, and NetBSD

Linux and Solaris (default)      NetBSD stable sort
and NetBSD (with -S)             (default) ordering

10.0.0.2 # sluggish              192.168.0.2 # laptop
192.168.0.2 # laptop             10.0.0.2 # sluggish
10.0.0.4 # mainframe             192.168.0.4 # office
192.168.0.4 # office             10.0.0.4 # mainframe
192.168.0.12 # speedy            192.168.0.12 # speedy
10.0.0.20 # lanyard              10.0.0.20 # lanyard


If our input file, ipaddr.list, had all the 192.168 addresses first, followed by
all the 10. addresses, then the stable sort would leave the 192.168 address
first when there is a tie (that is, when two elements in our sort have the same
value). We can see in Table 8-1 that this situation exists for laptop and
sluggish, since each has a 2 as its fourth field, and also for mainframe and
office, which tie with 4. In the default Linux sort (and NetBSD with the -S
option specified), the order is not guaranteed.
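
If you want tied lines to keep their input order on a system whose sort is
not stable by default, GNU sort (and several BSD versions) accept a -s option
to request a stable sort. That option is not used elsewhere in this recipe,
so treat it as an assumption and check your own manpage. A minimal sketch
with two tied lines:

```shell
# Two entries that tie on the fourth field (both end in .2)
hosts=$(printf '%s\n' '192.168.0.2 # laptop' '10.0.0.2 # sluggish')

# Without -s, the order of tied lines depends on the sort implementation
default_order=$(echo "$hosts" | LC_ALL=C sort -t. -k4,4n)

# With -s (an assumption about your sort), tied lines keep their input order
stable_order=$(echo "$hosts" | LC_ALL=C sort -s -t. -k4,4n)

echo "$stable_order"
```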


To get back to something easy, and just for practice, let’s sort by the text in 
our IP address list. This time we want our separator to be the # character and 
we want an alphabetic sort on the second field, so we get:


$ sort -t'#' -k2 ipaddr.list
10.0.0.20 # lanyard
192.168.0.2 # laptop
10.0.0.5 # mainframe
192.168.0.4 # office
10.0.0.2 # sluggish
192.168.0.12 # speedy
$


The sorting will start with the second key and, in this case, go through the 
end of the line. With just the one separator (#) per line, we didn’t need to 
specify the ending, though we could have written -k2,2. 


See Also 


• man sort
• ./functions/inetaddr, as provided in the bash tarball (Appendix B)


8.4 Cutting Out Parts of Your Output 


Problem 


You need to look at only part of your fixed-width or column-based data. 
You'd like to take a subset of it, based on the column position. 


Solution 


Use the cut command with the -c option to take particular columns:[1]


$ ps -l | cut -c12-15
PID
5391
7285
7286
$


or:


$ ps -elf | cut -c58-
(output not shown)
$


Discussion 


With the cut command we specify what portion of the lines we want to keep. 
In the first example, we are keeping columns 12 (starting at column 1) 
through 15, inclusive. In the second case, we specify starting at column 58 
but don’t specify the end of the range so that cut will take from column 58 on 
through the end of the line. 


Most of the data manipulation we’ve looked at has been based on fields,
relative positions separated by characters called delimiters. The cut command
can do that too, but it is one of the few utilities that you’ll use with bash that
can also easily deal with fixed-width, columnar data (via the -c option).
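
Since the output of ps varies so much between systems, here is a sketch of
the -c option on fixed-width data that we control. The record layout (a name
padded to 8 columns, then a 4-digit ID) is invented for the example.

```shell
# Hypothetical fixed-width records: name in columns 1-8, ID in columns 9-12
records=$(printf '%s\n' 'alice   0042' 'bob     0007')

ids=$(echo "$records" | cut -c9-12)      # columns 9 through 12, inclusive
last_two=$(echo "$records" | cut -c11-)  # column 11 through the end of line

echo "$ids"
echo "$last_two"
```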


Using cut to print out fields rather than columns is possible, though it’s more 
limited than other choices such as awk. The default delimiter between fields 
is the tab character, but you can specify a different delimiter with the -d 
option. Here is an example of a cut command using fields: 


cut -d'#' -f2 < ipaddr.list


and an equivalent awk command:


awk -F'#' '{print $2}' < ipaddr.list


You can even use cut to handle nonmatching delimiters by using more than 
one cut. You may be better off using a regular expression with awk for this, 
but sometimes a couple of quick and dirty cut commands are faster to figure 
out and type. 


Here is how you can get the field out from between square brackets. Note that
the first cut uses a delimiter of open square bracket (-d'[') and field 2 (-f2,
starting at 1). Because the first cut has already removed part of the line, the
second cut uses a delimiter of closed square bracket (-d']') and field 1
(-f1):


$ cat delimited_data
Line [11].
Line [12].
Line [13].
$ cut -d'[' -f2 delimited_data | cut -d']' -f1
11
12
13
$


See Also


• man cut
• man awk


8.5 Removing Duplicate Lines 


Problem 


After selecting and/or sorting some data, you notice that there are many
duplicate lines in your results. You’d like to get rid of the duplicates, so that
you can see just the unique values.


Solution 


You have two choices available to you. If you’ve just been sorting your 
output, add the -u option to the sort command: 


somesequence | sort -u


If you aren’t running sort, just pipe the output into uniq (provided, that is,
that the output is sorted, so that identical lines are adjacent):


somesequence | uniq > myfile 


Discussion 


Since uniq requires the data to be sorted already, we’re more likely to just
add the -u option to sort unless we also need to count the number of
duplicates (-c, see Recipe 8.2) or see only the duplicates (-d), which uniq
can do.


WARNING 
Don’t accidentally overwrite a valuable file; the uniq command is a
bit odd in its parameters. Whereas most Unix/Linux commands take multiple 
input files on the command line, uniq does not. In fact, the first (nonoption) 
argument is taken to be the (one and only) input file and any second argument, 
if supplied, is taken as the output file. So if you supply two filenames on the 
command line, the second one will get clobbered without warning. 
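
The clobbering is easy to see for yourself in a scratch directory. The
filenames in this sketch are made up, and everything happens under a
temporary directory so nothing valuable is at risk.

```shell
# Work in a throwaway directory; both filenames are hypothetical
workdir=$(mktemp -d)
printf '%s\n' a a b > "$workdir/input.txt"
printf '%s\n' 'precious data' > "$workdir/other.txt"

# uniq treats the second filename as OUTPUT, so other.txt is overwritten
uniq "$workdir/input.txt" "$workdir/other.txt"

clobbered=$(cat "$workdir/other.txt")
echo "$clobbered"
rm -rf "$workdir"
```

After the uniq command runs, other.txt no longer holds its original line; it
holds the deduplicated contents of input.txt.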


See Also 


• man sort
• man uniq
• Recipe 8.2, “Sorting Numbers”


8.6 Compressing Files 


Problem 


You need to compress some files and aren’t sure of the best way to do it.


Solution


First, you need to understand that in traditional Unix, archiving (or
combining) and compressing files are two different operations using two
different tools, while in the DOS and Windows world it’s typically one
operation with one tool. A “tarball” is created by combining several files
and/or directories using the tar (tape archive) command, then compressed
using the compress, gzip, or bzip2 tools. This results in files like tarball.tar.Z,
tarball.tar.gz, tarball.tgz, or tarball.tar.bz2, respectively. Having said that,
many other tools, including zip, are supported.


In order to use the correct format, you need to understand where your data 
will be used. If you are simply compressing some files for yourself, use 
whatever you find easiest. If other people will need to use your data, consider 
what platform they will be using and what they are comfortable with. 


The Unix traditional tarball was tarball.tar.Z, but gzip is now much more
common and xz and bzip2 (which offer better compression than gzip) are
gaining ground. There is also a tool question. Some versions of tar allow you
to use the compression of your choice automatically while creating the
archive. Others don’t.


The universally accepted Unix or Linux format would be a tarball.tar.gz 
created like this: 


$ tar cf tarball_name.tar directory_of_files 
$ gzip tarball_name.tar 
$ 


If you have GNU tar, you could use -Z for compress (don’t, this is obsolete), 
-z for gzip (safest), or -j for bzip2 (highest compression). Don’t forget to use 
an appropriate filename, as this is not automatic. For example: 


tar czf tarball_name.tgz directory_of_files 


While tar and gzip are available for many platforms, if you need to share with
Windows you are better off using zip, which is nearly universal:


zip -r zipfile_name directory_of_files


zip and unzip are supplied by the InfoZip packages on Unix and almost any
other platform you can possibly think of. Unfortunately, they are not always
installed by default. Run the command by itself for some helpful usage
information, since these tools are not like most other Unix tools. And note the
-l option to convert Unix line endings to DOS line endings, or -ll for the
reverse.


Discussion 


There are far too many compression algorithms and tools to talk about here;
others include ar, arc, arj, bin, bz2, cab, jar, cpio, deb, hqx, lha, lzh, rar,
rpm, uue, and zoo.


When using tar, we strongly recommend using a relative directory to store all
the files. If you use an absolute directory, you might overwrite something on
another system that you shouldn’t. If you don’t use any directory, you’ll
clutter up whatever directory the user is in when they extract the files (see
Recipe 8.8). The recommended use is the name and possibly version of the
data you are processing. Table 8-2 shows some examples.


Table 8-2. Good and bad examples of naming files for the tar utility

Good               Bad

./myapp_1.0.1      myapp.c
                   myapp.h
                   myapp.man

./bintools         /usr/local/bin
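
Putting the two-step approach and the relative-directory advice together
might look like the sketch below. The directory name, the file contents, and
the use of tar’s -C option to change directory first are our choices for the
example; run it somewhere harmless.

```shell
# Build a hypothetical project tree under a throwaway directory
workdir=$(mktemp -d)
mkdir -p "$workdir/myapp_1.0.1"
echo 'int main(void) { return 0; }' > "$workdir/myapp_1.0.1/myapp.c"

# -C changes into the parent first, so the archive stores a relative path
tar -cf "$workdir/myapp_1.0.1.tar" -C "$workdir" ./myapp_1.0.1
gzip "$workdir/myapp_1.0.1.tar"

# List the compressed archive without extracting it
listing=$(gzip -dc "$workdir/myapp_1.0.1.tar.gz" | tar -tf -)
echo "$listing"
rm -rf "$workdir"
```

Every path in the listing begins with ./myapp_1.0.1, so extracting the
archive creates one well-named directory and nothing else.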


It is worth noting that Red Hat Package Manager (RPM) files are actually
CPIO files with a header. You can get a shell or Perl script called rpm2cpio
to strip that header and then extract the files like this:


rpm2cpio some.rpm | cpio -i 


Debian’s .deb files are actually ar archives containing gzipped or bzipped tar 
archives. They may be extracted with the standard ar, gunzip, or bunzip2 
tools. 


Many of the Windows-based tools such as WinZip, PKZIP, FilZip, and 7-Zip 
can handle many or all of the formats mentioned here, and more (including 
tarballs and RPMs). 


See Also 


• man tar
• man gzip
• man bzip2
• man compress
• man zip
• man rpm
• man ar
• man dpkg
• http://www.info-zip.org/
• http://rpm5.org/docs/rpm-guide.html#id3049451
• http://en.wikipedia.org/wiki/Deb_(file_format)
• http://www.rpm.org/
• http://en.wikipedia.org/wiki/RPM_Package_Manager
• Recipe 7.9, “Grepping Compressed Files”
• Recipe 8.7, “Uncompressing Files”
• Recipe 8.8, “Checking a tar Archive for Unique Directories”
• Recipe 17.3, “Unzipping Many ZIP Files”


8.7 Uncompressing Files


Problem 


You need to uncompress one or more files ending in extensions like .tar,
.tar.gz, .gz, .tgz, .Z, or .zip.


Solution


Figure out what you are dealing with and use the right tool. Table 8-3 maps
common extensions to programs capable of handling them. The file command
is helpful here since it can usually tell you the type of a file even if the name
is incorrect.


Table 8-3. Common file extensions and compression utilities

File extension    Command

.tar              tar tf (list contents), tar xf (extract)

.tar.gz, .tgz     GNU tar: tar tzf (list contents), tar xzf (extract)
                  Else: gunzip file && tar xf file

.tar.bz2          GNU tar: tar tjf (list contents), tar xjf (extract)
                  Else: bunzip2 file && tar xf file

.tar.Z            GNU tar: tar tZf (list contents), tar xZf (extract)
                  Else: uncompress file && tar xf file

.zip              unzip (often not installed by default)


You should also try the file command:


$ file what_is_this.*
what_is_this.1: GNU tar archive
what_is_this.2: gzip compressed data, from Unix

$ gunzip what_is_this.2
gunzip: what_is_this.2: unknown suffix -- ignored

$ mv what_is_this.2 what_is_this.2.gz
$ gunzip what_is_this.2.gz
$ file what_is_this.2
what_is_this.2: GNU tar archive


Discussion 


If the file extension matches none of those listed in Table 8-3 and the file 
command doesn’t help, but you are sure it’s an archive of some kind, then 
you should do a web search for it. 


See Also


• Recipe 7.9, “Grepping Compressed Files”
• Recipe 8.6, “Compressing Files”


8.8 Checking a tar Archive for Unique Directories


Problem


You want to untar an archive, but you want to know beforehand which
directories it is going to write into. You can look at the table of contents of
the tar archive by using tar -t, but the output can be very large and it’s easy
to miss something.


Solution


Use an awk script to parse off the directory names from the tar archive’s table 
of contents, then use sort -u to leave you with just the unique directory 
names: 


tar tf some.tar | awk -F/ '{print $1}' | sort -u 


Discussion 


The t option will produce the table of contents for the file specified with the 
f option whose filename follows. The awk command specifies a nondefault 
field separator by using -F/ to specify a slash as the separator between fields. 
Thus, the print $1 will print the first directory name in the pathname. 


Finally, all the directory names will be sorted and only unique ones will be 
printed. 


If a line of the output contains a single period then some files will be 
extracted into the current directory when you unpack this tar file, so be sure 
to be in the directory you desire. 


Similarly, if the filenames in the archive are all local and without a leading ./, 
then you will get a list of filenames that will be created in the current 
directory. 


If the output contains a blank line, that means that some of the files are
specified with absolute pathnames (i.e., beginning with /); again be careful, as
extracting such an archive might clobber something that you don’t want
replaced.


Some tar programs strip the leading / by default (e.g., GNU tar) or 
optionally. That’s a much safer way to create a tarball, but you can’t count on 
that when you are looking at extracting one. 
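
To watch the pipeline at work without hunting for a suitable archive, you can
build a tiny one first. The directory and file names in this sketch are
invented, and the archive lives in a temporary directory.

```shell
# Build a small hypothetical archive with two top-level directories
workdir=$(mktemp -d)
mkdir -p "$workdir/proj/src" "$workdir/docs"
touch "$workdir/proj/src/main.c" "$workdir/proj/README" "$workdir/docs/guide.txt"
tar -cf "$workdir/some.tar" -C "$workdir" proj docs

# First path component of every entry, reduced to the unique names
top_level=$(tar -tf "$workdir/some.tar" | awk -F/ '{print $1}' | sort -u)
echo "$top_level"
rm -rf "$workdir"
```

However many files the archive holds, the output is just one line per
top-level name, which is much easier to scan than the full table of contents.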


See Also 


• man tar
• man awk
• Recipe 8.1, “Sorting Your Output”
• Recipe 8.2, “Sorting Numbers”
• Recipe 8.3, “Sorting IP Addresses”


8.9 Translating Characters 


Problem 


You need to convert one character to another in all of your text. 


Solution 


Use the tr command to translate one character to another. For example: 


tr ';' ',' <be.fore >af.ter 


Discussion 


In its simplest form, a tr command replaces occurrences of the first (and 
only) character of the first argument with the first (and only) character of the 
second argument. 


In the example solution, we redirected input from the file named be.fore and
sent the output into the file named af.ter, and we translated all occurrences of
a semicolon into a comma.


Why do we use the single quotes around the semicolon and the comma? 
Well, a semicolon has special meaning to bash, so if we didn’t quote it bash 
would break our command into two commands, resulting in an error. The 
comma has no special meaning, but we quote it out of habit to avoid any 
special meaning we may have forgotten about—it’s safer always to use the 
quotes, as then we never forget to use them when we need them. 


The tr command can do more than one translation at a time if we put the
several characters to be translated in the first argument and their
corresponding resultant characters in the second argument. Just remember,
it’s a one-for-one substitution. For example:


tr ';:.!?' ',' <other.punct >commas.all


will translate all occurrences of the punctuation symbols of semicolon, colon, 
period, exclamation point, and question mark to commas. Since the second 
argument is shorter than the first, its last (and here, its only) character is 
repeated to match the length of the first argument, so that each character has a 
corresponding character for the translation. 


This kind of translation could be done with the sed command, though sed 
syntax is a bit trickier. The tr command is not as powerful, since it doesn’t 
use regular expressions, but it does have some special syntax for ranges of 
characters—and that can be quite useful, as we’ll see in Recipe 8.10. 
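
Both behaviors can be checked on a line of sample text. The input strings
below are made up, and the padding of a short second argument is shown as it
works in the tr versions we describe here; other implementations may differ.

```shell
# One-for-one mapping: a becomes x, b becomes y, c becomes z
mapped=$(echo 'aabbcc' | tr 'abc' 'xyz')

# Short second argument: the comma is reused for every character in the set
punct=$(echo 'one;two:three.four!five?' | tr ';:.!?' ',')

echo "$mapped"
echo "$punct"
```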


See Also 


• man tr
• Recipe 8.10, “Converting Uppercase to Lowercase”


8.10 Converting Uppercase to Lowercase 


Problem 


You need to eliminate case distinctions in a stream of text. 


Solution 


You can translate all uppercase characters (A-Z) to lowercase (a-z) using the
tr command and specifying a range of characters, as in:


tr 'A-Z' 'a-z' <be.fore >af.ter


There is also special syntax in tr for specifying this sort of range for upper-
and lowercase conversions:


tr '[:upper:]' '[:lower:]' <be.fore >af.ter


There are some versions of tr that honor the current locale’s collating
sequence, and A-Z may not always be the set of uppercase letters in the
current locale. It’s better to avoid that problem and use [:lower:] and
[:upper:] if possible, but that does make it impossible to use subranges like
N-Z and a-m.
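
Here is a short sketch of both forms on a sample string of our own. The
subrange example pins LC_ALL=C, which is our assumption for making the a-m
range predictable.

```shell
# Character classes: safe regardless of the locale's collating sequence
lower=$(echo 'Hello, World' | tr '[:upper:]' '[:lower:]')

# A subrange: only the letters a through m are uppercased
partial=$(echo 'Hello, World' | LC_ALL=C tr 'a-m' 'A-M')

echo "$lower"
echo "$partial"
```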


Discussion 


Although tr doesn’t support regular expressions, it does support a range of 
characters. Just make sure that both arguments end up with the same number 
of characters. If the second argument is shorter, its last character will be 
repeated to match the length of the first argument. If the first argument is 
shorter, the second argument will be truncated to match the length of the first. 


Here’s a very simplistic encoding of a text message using a simple
substitution cypher that offsets each character by 13 places (i.e., ROT13). An
interesting characteristic of ROT13 is that the same process is used to both
encipher and decipher the text:


$ cat /tmp/joke
Q: Why did the chicken cross the road?
A: To get to the other side.

$ tr 'A-Za-z' 'N-ZA-Mn-za-m' < /tmp/joke
D: Jul qvq gur puvpxra pebff gur ebnq?
N: Gb trg gb gur bgure fvqr.

$ tr 'A-Za-z' 'N-ZA-Mn-za-m' < /tmp/joke | tr 'A-Za-z' 'N-ZA-Mn-za-m'
Q: Why did the chicken cross the road?
A: To get to the other side.


See Also


• man tr
• http://en.wikipedia.org/wiki/Rot13
• Recipe 8.9, “Translating Characters”


8.11 Converting DOS Files to Linux Format 


Problem 


You need to convert DOS-formatted text files to the Linux format. In DOS, 
each line ends with a pair of characters—the return and the newline. In 
Linux, each line ends with a single newline. So how can you delete that extra 
DOS character? 


Solution 


Use the -d option on tr to delete the character(s) in the supplied list. For 
example, to delete all DOS carriage returns (\r), use the command: 


tr -d '\r' <file.dos >file.txt 


WARNING 
This will delete all \r characters in the file, not just those at the end of a line. 
Typical text files rarely have characters like that inline, but it is possible. You 
may wish to look into the dos2unix and unix2dos programs if you are worried 
about this. 
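
If dos2unix isn’t available, one alternative is to anchor the deletion to the
end of each line with sed. This sed approach is our suggestion rather than
part of the recipe, and the sample text is invented; the carriage return is
built with printf so that it works with any sed.

```shell
cr=$(printf '\r')   # a literal carriage return, built portably

# tr -d removes every carriage return, wherever it appears in the line
all_removed=$(printf 'a\rb\r\n' | tr -d '\r')

# sed with an anchored pattern removes one only at the end of a line
eol_removed=$(printf 'a\rb\r\n' | sed "s/${cr}\$//")

printf '%s\n' "$all_removed"
```

The first result is ab, while the second still contains the carriage return
between a and b.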


Discussion 


The tr utility has a few special escape sequences that it recognizes, among
them \r for carriage return and \n for newline. The other special backslash
sequences are listed in Table 8-4.


Table 8-4. The special escape sequences of the tr utility

Sequence    Meaning

\ooo        Character with octal value ooo (1-3 octal digits)
\\          Backslash character (i.e., escapes the backslash itself)
\a          “Audible” bell, the ASCII BEL character (since “b” was taken for
            backspace)
\b          Backspace
\f          Form feed
\n          Newline
\r          Return
\t          Tab (sometimes called a “horizontal” tab)
\v          Vertical tab


See Also


• man tr


8.12 Removing Smart Quotes


Problem 


You want simple ASCII text out of a document in MS Word, but when you 
save it as text some odd characters still remain. 


Solution 
Translate the odd characters back to simple ASCII like this: 


tr '\221\222\223\224\226\227' '\047\047""--' <odd.txt >plain.txt 




Discussion 


Such “smart quotes” come from the Windows-1252 character set, and may 
also show up in email messages that you save as text. 


To clean up such text, we can use the tr command. The 221 and 222 (octal)
curved single quotes will be translated to simple single quotes. We specify
them in octal (047) to make it easier on us, since the shell uses single quotes
as a delimiter. The 223 and 224 (octal) are opening and closing curved
double quotes, and will be translated to simple double quotes. The double
quotes can be typed within the second argument since the single quotes
protect them from shell interpretation. The 226 and 227 (octal) are dash
characters and will be translated to hyphens (and no, that second hyphen in
the second argument is not technically needed since tr will repeat the last
character to match the length of the first argument, but it’s better to be
specific).


See Also 


• man tr
• https://en.wikipedia.org/wiki/Quotation_mark#Curved_quotes_and_ Unicoi
  for way more than you might ever have wanted to know about quotation
  marks and related character set issues


8.13 Counting Lines, Words, or Characters in a File


Problem 


You need to know how many lines, words, or characters are in a given file. 


Solution 


Use the wc (word count) command in a command substitution.


The normal output of wc is something like this:


$ wc data_file
5 15 60 data_file

# Lines only
$ wc -l data_file
5 data_file

# Words only
$ wc -w data_file
15 data_file

# Characters (often the same as bytes) only
$ wc -c data_file
60 data_file

# Note 60B
$ ls -l data_file
-rw-r--r-- 1 jp users 60B Dec 6 03:18 data_file


You may be tempted to just do something like this:


data_file_lines=$(wc -l "$data_file")


That won’t do what you expect, since you’ll get something like "5
data_file" as the value. You may also see:


data_file_lines=$(cat "$data_file" | wc -l)


Instead, use this to avoid the filename problem without a useless use of cat:


data_file_lines=$(wc -l < "$data_file")
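
The difference is easy to confirm on a scratch file. The file and its three
lines are made up for this sketch; note that some versions of wc pad the
number with leading spaces, so a numeric test such as -eq is the safer way to
check the value.

```shell
# Hypothetical three-line data file in a temporary location
data_file=$(mktemp)
printf 'one\ntwo\nthree\n' > "$data_file"

with_name=$(wc -l "$data_file")          # the count followed by the filename
data_file_lines=$(wc -l < "$data_file")  # the count only
data_file_bytes=$(wc -c < "$data_file")  # 14 bytes, newlines included

if [ "$data_file_lines" -eq 3 ]; then
    echo "three lines, $data_file_bytes bytes"
fi
rm -f "$data_file"
```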


Discussion 


If your version of wc is locale-aware, the number of characters will not equal 
the number of bytes in some character sets. 

See Also


• man wc
• Recipe 15.7, “Splitting Output Only When Necessary”


8.14 Rewrapping Paragraphs 


Problem 


You have some text with lines that are too long or too short, so you'd like to 
rewrap them to be more readable. 


Solution 


Use the fmt command:


fmt mangled_text


optionally with a goal and maximum line length:


fmt 55 60 mangled_text


Discussion 


One tricky thing about fmt is that it expects blank lines to separate headers 
and paragraphs. If your input file doesn’t have those blanks, it has no way to 
tell the difference between different paragraphs and extra newlines inside the 
same paragraph—so you will end up with one giant paragraph, with the 
correct line lengths. 
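
Not every fmt accepts the goal and maximum as two bare numbers. GNU fmt and
the BSD versions we know of take a maximum width with -w, so the sketch below
uses that form; the option choice and the sample sentence are our
assumptions, so check your manpage.

```shell
# One hypothetical overlong line of text
long_line='The quick brown fox jumps over the lazy dog and keeps running through the field.'

# Rewrap to lines of at most 30 characters
wrapped=$(echo "$long_line" | fmt -w 30)

# Measure the longest resulting line with awk
longest=$(echo "$wrapped" |
    awk '{ if (length($0) > max) max = length($0) } END { print max }')

echo "$wrapped"
echo "longest line: $longest"
```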


The pr command might also be of some interest for formatting text. 


See Also 


• man fmt
• man pr


8.15 Doing More with less 


Problem 


“less is more!” You'd like to take better advantage of the features of the less 
pager. 


Solution 


Read the less manpage and use the $LESS variable with ~/.lessfilter and
~/.lesspipe files.


less takes options from the $LESS variable, so rather than creating an alias 
with your favorite options, put them in that variable. It takes both long and 
short options, and any command-line options will override options in the 
variable. We recommend using the long options in the $LESS variable since 
they are easy to read. For example: 


export LESS="--LONG-PROMPT --LINE-NUMBERS --ignore-case --QUIET" 


But that is just the beginning. less is expandable via input preprocessors,
which are simply programs or scripts that preprocess the file that less is about
to display. This is handled by setting the $LESSOPEN and $LESSCLOSE
environment variables appropriately.
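
To make the idea concrete, here is a deliberately tiny preprocessor of our
own; it is not one of the packaged scripts, and its name and location are
made up. It relies on behavior described in the less manpage: with a leading
| in $LESSOPEN, less displays whatever the command writes to standard output,
and falls back to the original file if the command writes nothing.

```shell
# Hypothetical preprocessor script, created in a temporary location
filter=$(mktemp)
cat > "$filter" <<'EOF'
#!/bin/sh
# less passes the filename as $1; output here is shown instead of the raw file
case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    exit 0 ;;   # print nothing: less falls back to the file itself
esac
EOF
chmod +x "$filter"

# The leading | tells less to read the preprocessor's standard output
LESSOPEN="|$filter %s"
export LESSOPEN
```

With that in place, less somefile.gz shows the uncompressed text, and any
other file is displayed as usual.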


You could build your own, but save yourself some time (after reading the
following discussion) and look into Wolfgang Friebel’s lesspipe.sh. The
script works by setting and exporting the $LESSOPEN environment variable
when run by itself:


$ ./lesspipe.sh
LESSOPEN="|./lesspipe.sh %s"
export LESSOPEN
$


So you simply run it in an eval statement, like eval
$(/path/to/lesspipe.sh) or eval `/path/to/lesspipe.sh`, and then
use less as usual. A partial list of supported formats for version 1.82 is:


gzip, compress, bzip2, zip, rar, tar, nroff, ar archive, pdf, ps, dvi, shared
library, executable, directory, RPM, Microsoft Word, OASIS
(OpenDocument, Openoffice, Libreoffice) formats, Debian, MP3 files,
image formats (png, gif, jpeg, tiff, ...), utf-16 text, iso images and
filesystems on removable media via /dev/xxx.


But there is a catch. These formats require various external tools, so not all
features in the example lesspipe.sh will work if you don’t have them. The
package also contains ./configure (or make) scripts to generate a version of
the filter that will work on your system, given the tools that you have
available.


Discussion 


less is unique in that it is a GNU tool that was already installed by default on 
every single test system we tried—every one. Not even bash can say this. 
And version differences aside, it works the same on all of them. Quite a claim 
to fame. 


However, the same cannot be said for lesspipe* and the $LESSOPEN filters.
We found other versions, with wildly variable capabilities, besides the ones
listed in the Solution section:


• Red Hat has a /usr/bin/lesspipe.sh that can’t be used like eval
  `/path/to/lesspipe.sh`.
• Debian has a /usr/bin/lesspipe that can be eval’ed and also supports
  additional filters via a ~/.lessfilter file.
• SUSE Linux has a /usr/bin/lessopen.sh that can’t be eval’ed.
• FreeBSD has a trivial /usr/bin/lesspipe.sh (no eval, .Z, .gz, or .bz2).
• Solaris, HP-UX, the other BSDs, and macOS have nothing by default.


To see if you already have one of these, try this on your system. This Debian
system has the Debian lesspipe installed but not in use (since $LESSOPEN is
not defined):


$ type lesspipe.sh; type lesspipe; set | grep LESS
-bash3: type: lesspipe.sh: not found
lesspipe is /usr/bin/lesspipe


This Ubuntu system has the Debian lesspipe installed and in use:


$ type lesspipe.sh; type lesspipe; set | grep LESS
-bash: type: lesspipe.sh: not found
lesspipe is hashed (/usr/bin/lesspipe)
LESSCLOSE='/usr/bin/lesspipe %s %s'
LESSOPEN='| /usr/bin/lesspipe %s'
$


We recommend that you download, configure, and use Wolfgang Friebel’s 
lesspipe.sh because it’s the most capable. We also recommend that you read 
the less manpage because it’s very interesting. 


See Also 
■ man less
■ man lesspipe
■ man lesspipe.sh
■ http://www.greenwoodsoftware.com/less/
■ http://www-zeuthen.desy.de/~friebel/unix/lesspipe.html


1 Note that our example ps command only works with certain systems; e.g., CentOS-4, 
Fedora Core 5, and Ubuntu work, but Red Hat 8, NetBSD, Solaris, and macOS all 
garble the output due to using different columns.


Chapter 9. Finding Files: find, locate, slocate


How easy is it for you to search for files throughout your filesystem? 


For the first few files that you created, it was probably easy enough just to 
remember their names and where you kept them. Then when you got more 
files, you created subdirectories (or folders in GUI-speak) to clump your files 
into related groups. Soon there were subdirectories inside of subdirectories, 
and now you are having trouble remembering where you put things. And of 
course, with larger and larger disks it is getting easier to just keep creating 
and never deleting any files (and for some of us, this getting older thing isn’t 
helping either). 


But how do you find that file you were just editing last week? Or the 
attachment that you saved in a subdirectory (which seemed such a logical 
choice at the time)? Or maybe your filesystem has become cluttered with 
MP3 files scattered all over it, and you want to collect them all up. 


Various attempts have been made to provide graphical interfaces to help you 
search for files, which is all well and good—but how do you use the results 
from a GUI-style search as input to other commands? 


bash and the GNU tools can help. They provide some very powerful search 
capabilities that enable you to search by filename, dates of creation or 
modification, even content. They send the results to standard output, perfect 
for use in other commands or scripts. 


So stop your wondering—here’s the information you need. 


9.1 Finding All Your MP3 Files 


Problem


You have MP3 audio files scattered all over your filesystem. You’d like to
move them all into a single location so that you can organize them and then 
copy them onto a music player. 


Solution 


The find utility can locate all of those files and then execute a command to 
move them where you want. For example: 


find . -name '*.mp3' -print -exec mv '{}' ~/songs \; 


Discussion 


The syntax for the find utility is unlike that of other Unix tools. It doesn’t use 
options in the typical way, with dash and single-letter collections up front 
followed by several words of arguments. Rather, the options look like short 
words, and are ordered in a logical sequence describing the logic of which 
files are to be found, and what to do with them, if anything, when they are 
found. These word-like options are often called predicates. 


A find command’s first arguments are the directory or directories in which to 
search. A typical use is simply (.) for the current directory, but you can 
provide a whole list of directories, or even search the entire filesystem 
(permissions allowing) by specifying the root of the filesystem (/) as the 
starting point. 


In our example the first option (the -name predicate) specifies the pattern we 
will search for. Its syntax is like the bash pattern-matching syntax, so *.mp3 
will match all filenames that end in the characters “.mp3”. Any file that 
matches this pattern is considered to return true and is passed along to the 
next predicate of the command. 


Think of it this way: find will climb around in the filesystem, and each 
filename that it finds it will present to this gauntlet of predicates that must be 
run. Any predicate that is true is passed. Encounter a false, and that 
filename’s turn is immediately over, and the next filename is processed.


The -print predicate is easy. It is always true and it has the side effect of
printing the name to standard output, so any file that has made it this far in 
the sequence of predicates will have its name printed. 

The -exec is a bit odd. Any filename making it this far will become part of a 
command that is executed. The remainder of the line, up to the \;, is the 
command to be executed. The {} is replaced by the name of the file that was 
found. So in our example, if find encounters a file named mhsr.mp3 in the 
./music/jazz subdirectory, then the command that will be executed will be: 


mv ./music/jazz/mhsr.mp3 ~/songs 


The command will be issued for each file that matches the pattern. If lots and 
lots of matching files are found, lots and lots of commands will be issued. 
Sometimes this is too demanding of system resources, and it can be a better 
idea to use find just to find the files and print the filenames into a datafile, 
and issue fewer commands by consolidating arguments several to a line. (But 
with machines getting faster all the time, this is less and less of an issue. It 
might even be something worthwhile for your dual-core or quad-core 
processor to do.) 
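

If you would like to watch the predicates work before turning find loose on your real music, here is a minimal sketch that runs the same command in a scratch directory. The directory layout and filenames are made up for illustration, and we assume mktemp is available:

```bash
# Scratch-directory demo; every file and directory name below is made up.
demo=$(mktemp -d)
mkdir -p "$demo/src/music/jazz" "$demo/src/docs" "$demo/songs"
touch "$demo/src/music/jazz/mhsr.mp3" "$demo/src/top.mp3" "$demo/src/docs/notes.txt"

# -name filters, -print reports each survivor, and -exec runs mv once per file.
moved=$(cd "$demo/src" && find . -name '*.mp3' -print -exec mv '{}' "$demo/songs" \; | sort)

echo "$moved"        # the two .mp3 pathnames that passed the -name test
ls "$demo/songs"     # mhsr.mp3 and top.mp3; notes.txt never left src/docs
```

The destination sits outside the tree being searched, so find never trips over files it has already moved.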


See Also 


■ man find
■ Recipe 1.5, “Finding and Running Commands”
■ Recipe 1.6, “Getting Information About Files”
■ Recipe 9.2, “Handling Filenames Containing Odd Characters”


9.2 Handling Filenames Containing Odd Characters


Problem


You used a find command like the one in Recipe 9.1, but the results were not
what you intended because many of your filenames contain odd characters. 


Solution 


First, understand that to Unix folks, odd means “anything not a lowercase 
letter, or maybe a number.” So uppercase letters, spaces, punctuation, and 
character accents are all odd, but you’ll find all of those and more in the 
names of many songs and bands. 


Depending on the oddness of the characters and your system, tools, and goal,
it might be enough to simply quote the replacement string (i.e., put single
quotes around the {}, as in '{}'). You did test your command first, right?


If that’s no good, try using the -print0 argument to find and the -0 argument
to xargs. -print0 tells find to use the null character (\0) instead of
whitespace as the output delimiter between pathnames found. -0 then tells
xargs the input delimiter. These will always work, but they are not supported
on every system.


The xargs command takes whitespace-delimited (except when using -0)
pathnames from standard input and executes a specified command on as
many of them as possible (up to a bit less than the system’s ARG_MAX value;
see Recipe 15.13). Since there is a lot of overhead associated with calling
other commands, using xargs can drastically speed up operations because you 
are calling the other command as few times as possible, rather than each time 
a pathname is found. 


So, to rewrite the solution from Recipe 9.1 to handle odd characters:


find . -name '*.mp3' -print0 | xargs -i -0 mv '{}' ~/songs


Here is a similar example demonstrating how to use xargs to work around 
spaces in a path or filename when locating and then copying files: 


locate P1100087.JPG PC220010.JPG PA310075.JPG PA310076.JPG | xargs -i cp '{}' .


Discussion


There are two problems with this approach. One is that not all versions of 
xargs support the -i option, and the other is that the -i option eliminates 
argument grouping, thus negating the speed increase we were hoping for. The 
problem is that the mv command needs the destination directory as the final 
argument, but traditional xargs will simply take its input and tack it onto the 
end of the given command until it runs out of space or input. The results of 
that behavior applied to an mv command would be very, very ugly. Some 
versions of xargs provide a -i switch that defaults to using {} (like find), but
using -i results in the command being run repeatedly, once for each
argument. So the only benefit over using find’s -exec is the odd-character 
handling. 


The xargs utility is most effective when used in conjunction with find and a 
command like chmod that just wants a list of arguments to process. You can 
really see a vast speed improvement when handling large numbers of 
pathnames. For example: 


find some_directory -type f -print0 | xargs -0 chmod 0644
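

To convince yourself that the null-delimited pipeline really does survive odd filenames, you might try something like the following sketch in a scratch directory. The filenames are invented, and we assume your find and xargs support -print0 and -0 (the GNU and BSD versions do):

```bash
# Scratch-directory demo; the filenames are made up to include "odd" characters.
demo=$(mktemp -d)
touch "$demo/plain.mp3" "$demo/two words.mp3" "$demo/it's odd.mp3"
chmod 0600 "$demo"/*.mp3

# NUL-delimited names survive spaces and quotes; chmod runs once for the batch.
find "$demo" -type f -print0 | xargs -0 chmod 0644

# Count the files now at exactly mode 0644 by counting NUL terminators.
count=$(( $(find "$demo" -type f -perm 0644 -print0 | tr -dc '\0' | wc -c) ))
echo "files at 0644: $count"
```

All three files should end up with the new mode, including the ones whose names would have been split apart by a whitespace-delimited xargs.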


See Also 


■ man find
■ man xargs
■ Recipe 9.1, “Finding All Your MP3 Files”
■ Recipe 15.13, “Working Around “Argument list too long” Errors”


9.3 Speeding Up Operations on Found Files 


Problem 


You used a find command like the one in Recipe 9.1, but the resulting 
operations took a long time because you found a lot of files. You want to
speed it up.


Solution 


See the discussion on xargs in Recipe 9.2. 


See Also 
■ Recipe 9.1, “Finding All Your MP3 Files”
■ Recipe 9.2, “Handling Filenames Containing Odd Characters”


9.4 Finding Files Across Symbolic Links 


Problem 


You issued a find command to find your .mp3 files, but it didn’t find all of 
them—it missed all those that were part of your filesystem but were mounted 
via a symbolic link. Is find unable to cross that kind of boundary? 


Solution 


Use the -L option (or, with older versions of find, the -follow predicate). The
example we used in Recipe 9.2 becomes:


find -L . -name '*.mp3' -print0 | xargs -i -0 mv '{}' ~/songs


Discussion 


Sometimes you don’t want find to cross over onto other filesystems, which is 
where symbolic links originated. So the default for find is not to follow a 
symbolic link. If you do want it to do so, then use the -L option on your find 
command, immediately following the command name and before the 
directory list. 
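

The difference is easy to see in a scratch directory. This is only a sketch with made-up names: one real directory holding a song, and a second tree that reaches it only through a symbolic link:

```bash
# Scratch-directory demo; all names are made up for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/real" "$demo/tree"
touch "$demo/real/song.mp3"
ln -s "$demo/real" "$demo/tree/link"     # tree/link -> real

# By default find reports the link itself but does not descend through it.
without=$(( $(find "$demo/tree" -name '*.mp3' -print | wc -l) ))

# With -L, find follows the link and sees the song behind it.
with=$(( $(find -L "$demo/tree" -name '*.mp3' -print | wc -l) ))

echo "default: $without match(es); with -L: $with match(es)"
# → default: 0 match(es); with -L: 1 match(es)
```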


See Also


■ man find


9.5 Finding Files Irrespective of Case 


Problem 


Some of your MP3 files end with .MP3 rather than .mp3. How do you find 
those? 


Solution 


Use the -iname predicate (if your version of find supports it) to run a
case-insensitive search, rather than just -name. For example:


find . -follow -iname '*.mp3' -print0 | xargs -i -0 mv '{}' ~/songs


Discussion 


Sometimes you care about the case of the filename and sometimes you don’t. 
Use the -iname option when you don’t care; i.e., in situations like this, where 
.mp3 and .MP3 both indicate that the file is probably an MP3 file. (We say 
probably because on Unix-like systems you can name a file anything that you 
want. It isn’t forced to have a particular extension.) 


One of the most common places where you’ll see the upper- and lowercase 
issue is when dealing with Microsoft Windows—compatible filesystems, 
especially older or “lowest common denominator” filesystems. A digital 
camera that we use stores its files with filenames like PICT001.JPG, 
incrementing the number with each picture. If you were to try: 


find . -name '*.jpg' -print


you wouldn’t find many pictures. In this case you could also try:


find . -name '*.[Jj][Pp][Gg]' -print


since that pattern will match either letter in each pair of brackets, but that isn’t
as easy to type, especially if the pattern that you want to match is much
longer. In practice, using -iname is an easier choice. The catch is that not
every version of find supports the -iname predicate. If your system doesn’t
support it, you could try tricky bracket patterns as shown here, use
multiple -name options with the case variations you expect, or install the
GNU version of find.
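

Here is a small sketch comparing the three approaches on a handful of invented camera-style filenames. It assumes your find supports -iname; if it doesn’t, the bracket-pattern line is the one to keep:

```bash
# Scratch-directory demo; the filenames are made up for illustration.
demo=$(mktemp -d)
touch "$demo/PICT001.JPG" "$demo/pict002.jpg" "$demo/Pict003.Jpg" "$demo/notes.txt"

exact=$(( $(find "$demo" -name '*.jpg' -print | wc -l) ))              # lowercase only
brackets=$(( $(find "$demo" -name '*.[Jj][Pp][Gg]' -print | wc -l) ))  # portable
insensitive=$(( $(find "$demo" -iname '*.jpg' -print | wc -l) ))       # needs -iname support

echo "-name: $exact, brackets: $brackets, -iname: $insensitive"
# → -name: 1, brackets: 3, -iname: 3
```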


See Also 


■ man find


9.6 Finding Files by Date 


Problem 


Someone sent you a JPEG image file that you saved on your filesystem a few 
months ago. Now you don’t remember where you put it. How can you find 
it? 


Solution


Use a find command with the -mtime predicate, which checks the date of last
modification. For example:


find . -name '*.jpg' -mtime +90 -print 


Discussion 


The -mtime predicate takes an argument to specify the time frame for the 
search. The 90 stands for 90 days. By using a plus sign on the number (+90) 
we indicate that we’re looking for a file modified more than 90 days ago. 
Write -90 (using a minus sign) for less than 90 days. Use neither a plus nor a
minus to mean exactly 90 days.


There are several predicates for searching based on file modification times,
and each takes a quantity argument. Using a plus, minus, or no sign indicates
greater than, less than, or equal to, respectively, for all of those predicates.
The find utility also has logical AND, OR, and NOT constructs, so if you
know that the file was at least one week (7 days) but not more than 14 days
old, you can combine the predicates like this:


find . -mtime +7 -a -mtime -14 -print 


You can get even more complicated, using OR as well as AND and even 
NOT to combine predicates, as in: 


find . \( -mtime +14 -name '*.text' -o -mtime -14 -name '*.txt' \) -print


This will print out the names of files ending in .text that are older than 14 
days, as well as those that are newer than 14 days but have .txt as their last 4 
characters. 


You will likely need parentheses to get the precedence right. Two predicates 
in sequence are like a logical AND, which binds tighter than an OR (in find 
as in most languages). Use parentheses as much as you need to make it 
unambiguous. 


Parentheses have a special meaning to bash, so we need to escape that 
meaning and write them as \( and \) or inside of single quotes as '(' and 
')'. You cannot use single quotes around the entire expression though, as 
that will confuse the find command. It wants each predicate as its own word. 
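

To experiment with the plus and minus signs and with grouping, you can backdate a few scratch files with touch -t. This is only a sketch; the filenames and the 2020 timestamp are arbitrary choices of ours:

```bash
# Scratch-directory demo; filenames and the timestamp are arbitrary.
demo=$(mktemp -d)
touch -t 202001010000 "$demo/old.jpg" "$demo/old.txt"   # mtime: 1 Jan 2020
touch "$demo/new.jpg"                                   # mtime: now

older=$(cd "$demo" && find . -name '*.jpg' -mtime +90 -print)   # more than 90 days
newer=$(cd "$demo" && find . -name '*.jpg' -mtime -90 -print)   # less than 90 days

# Escaped parentheses group the OR so that -mtime and -print apply to both names.
grouped=$(cd "$demo" && find . \( -name '*.jpg' -o -name '*.txt' \) -mtime +90 -print | sort)

echo "older:   $older"
echo "newer:   $newer"
echo "grouped: $grouped"
```

Without the parentheses, the implicit AND would bind -mtime and -print to the second -name only, and old.jpg would quietly drop out of the results.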


See Also 


■ man find


9.7 Finding Files by Type


Problem


You are looking for a directory with the word “java” in its name. When you 
tried: 


find . -name '*java*' -print 


you got way too many files—including all the Java source files in your part 
of the filesystem. 


Solution 


Use the -type predicate to select only directories:


find . -type d -name '*java*' -print 


Discussion 


We put the -type d first, followed by the -name '*java*'. Either order 
would have found the same set of files, but putting the -type d first in the 
list of predicates makes the search slightly more efficient: as each file is 
encountered, the test will be made to see if it is a directory and then only 
directories will have their names checked against the pattern. All files have 
names; relatively few are directories. So, this ordering eliminates most files 
from further consideration before we ever do the string comparison. Is it a big 
deal? With processors getting faster all the time, it matters less. With disk 
sizes getting bigger all the time, it matters more. There are several types of 
files for which you can check, not just directories. Table 9-1 lists the single 
characters used to find these types of files. 


Table 9-1. Characters used by find’s -type predicate

Key  Meaning
b    Block special file
c    Character special file
d    Directory
p    Pipe (or “fifo”)
f    Plain ol’ file
l    Symbolic link
s    Socket
D    Door (Solaris only)
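

As a quick illustration of how much -type d narrows things down, here is a sketch using an invented project directory:

```bash
# Scratch-directory demo; all names are made up for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/projects/java-tools"
touch "$demo/projects/Main.java" "$demo/projects/java-notes.txt"

# Name test alone: matches the directory and both files.
any=$(( $(find "$demo/projects" -name '*java*' -print | wc -l) ))

# Type test first: only directories ever reach the name comparison.
dirs=$(cd "$demo" && find . -type d -name '*java*' -print)

echo "name only: $any matches"
echo "directories: $dirs"
```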


See Also 


■ man find


9.8 Finding Files by Size 


Problem 


You want to do a little housecleaning, and to get the most out of your effort 
you are going to start by finding your largest files and deciding if you need to 
keep them around. But how do you find your largest files? 


Solution 


Use the -size predicate in the find command to select files above, below, or 
of exactly a certain size. For example: 


find . -size +3000k -print 


Discussion 


Like the numeric argument to -mtime, the -size predicate’s numeric
argument can be preceded by a minus sign, a plus sign, or no sign at all to
indicate less than, greater than, or exactly equal to the numeric argument. In 
our example, we’re looking for files that are greater than the size indicated. 


The size indicated includes a unit of k for kilobytes. If you use c for the unit,
that means just bytes (or characters). If you use b, or don’t put any unit, that
indicates a size in blocks. (The block is a 512-byte block, historically a
common unit in Unix systems.) So, we’re looking for files that are greater
than 3 MB in size.
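

If you want to see the plus and minus signs at work on files of known size, a sketch like this one will do. The filenames are made up, we use dd to manufacture the files, and we assume your find accepts the k unit (the GNU and BSD versions do):

```bash
# Scratch-directory demo; filenames are made up for illustration.
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/big.bin"   bs=1024 count=4096 2>/dev/null   # 4,096 KB
dd if=/dev/zero of="$demo/small.bin" bs=1024 count=10   2>/dev/null   # 10 KB

big=$(cd "$demo" && find . -type f -size +3000k -print)     # larger than 3000k
small=$(cd "$demo" && find . -type f -size -3000k -print)   # smaller than 3000k

echo "big: $big"
echo "small: $small"
```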


TIP


If you want to delete the files and are using a version of find that supports it, the 
-delete action is much easier than trying to use rm or xargs rm. 


See Also 


■ man find
■ man du
■ “Solution”


9.9 Finding Files by Content 


Problem 


You wrote an important letter and saved it as a text file, putting .txt on the 
end of the filename, but you’ve forgotten the rest of the name. Beyond that, 
the only thing you remember about the content of the letter is that you used 
the word “portend.” How do you find a file with some known content? 


Solution 


If you are in the vicinity of that file, say within the current directory, you can
start with a simple grep:


grep -i portend *.txt


With the -i option, grep will ignore upper- and lowercase differences. This 
command may not be sufficient to find what you’re looking for, but start 
simply. Of course, if you think the file might be in one of your many 
subdirectories, you can try to reach all the files that are in subdirectories of 
the current directory with this command: 


grep -i portend */*.txt 


Let’s face it, though, that’s not a very thorough search. 


If that doesn’t do it, let’s use a more complete solution: the find command. 
Use the -exec option on find so that if the predicates are true up to that point, 
it will execute a command for each file it finds. You can invoke grep or other 
utilities like this: 


find . -name '*.txt' -exec grep -Hi portend '{}' \; 


Discussion 


We use the -name '*.txt' construct to help narrow down the search. Any 
such test will help, since having to run a separate executable for each file the 
command finds is costly in time and CPU horsepower. Maybe you have a 
rough idea of how old the file is (e.g., -mtime -5 or some such); if so, add
that too. 


The '{}' is where the filename is put when executing the command. The \; 
indicates the end of the command, in case you want to continue with more 
predicates. Both the braces and the semicolon need to be escaped, so we 
quote one and use the backslash for the other. It doesn’t matter which way we 
escape them, only that we do escape them so that bash doesn’t misinterpret 
them.


On some systems, the -H option will print the name of the file if grep finds
something. Normally, with only one filename on the command, grep won’t
bother to name the file; it just prints out the matching line that it finds. Since
we’re searching through many files, we need to know which file was
grepped.


If you’re running a version of grep that doesn’t have the -H option, then just 
put /dev/null as one of the filenames on the grep command. The grep 
command will then have more than one file to open, and will print out the 
filename if it finds the text. 
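

The /dev/null trick looks like this in practice. This is a sketch in a scratch directory, with letter names and contents that we made up:

```bash
# Scratch-directory demo; filenames and contents are made up for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/letters/2019"
printf 'These clouds portend rain.\n' > "$demo/letters/2019/aunt.txt"
printf 'Nothing to see here.\n'       > "$demo/letters/todo.txt"

# /dev/null is a second, always-empty file, so grep labels every match it prints.
hits=$(cd "$demo" && find . -name '*.txt' -exec grep -i portend '{}' /dev/null \;)

echo "$hits"
# → ./letters/2019/aunt.txt:These clouds portend rain.
```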


See Also 


■ man find


9.10 Finding Existing Files and Content Fast 


Problem 


You'd like to be able to find files without having to wait for a long find 
command to complete, or you need to find a file with some specific content. 


Solution 


If your system has locate, slocate, Beagle, Spotlight, or some other indexer, 
you are already set. If not, look into them. 


As we discussed in Recipe 1.5, locate and slocate consult database files about
the system (usually compiled and updated by a cron job) to find file or
command names almost instantly. The location of the actual database files,
what is indexed therein, and how often may vary from system to system. 
Consult your system’s manpages for details. Here’s an example: 


$ locate apropos
/usr/bin/apropos
/usr/share/man/de/man1/apropos.1.gz
/usr/share/man/es/man1/apropos.1.gz
/usr/share/man/it/man1/apropos.1.gz
/usr/share/man/ja/man1/apropos.1.gz
/usr/share/man/man1/apropos.1.gz


locate and slocate don’t index content, though, so see Recipe 9.9 for that. 


Most modern graphical operating systems now include local search tools that 
use an indexer to crawl, parse, and index the names and contents of all of the 
files (and usually email messages) in your personal file space; i.e., your home 
directory on a Unix or Linux system. This information is then almost 
instantly available to you when you look for it. These tools are usually very 
configurable, graphical, and operate on a per-user basis. 


Discussion 


slocate stores permission information (in addition to filenames and paths), so 
it will not list programs to which the user does not have access. On most 
Linux systems locate is a symbolic link to slocate; other systems may have 
separate programs, or may not have slocate at all. Both of these are
command-line tools that crawl and index the entire filesystem, more or less, 
but they only contain filenames and locations. 


See Also 


■ man locate
■ man slocate
■ Recipe 1.5, “Finding and Running Commands”
■ Recipe 9.9, “Finding Files by Content”


9.11 Finding a File Using a List of Possible Locations


Problem


You need to execute, source, or read a file, but it could be located in a 
number of different places in or outside of the $PATH. 


Solution 


If you are going to source the file and it’s located somewhere on the $PATH,
just source it. bash’s builtin source command (also known by the shorter-to-type
but harder-to-read POSIX name .) will search the $PATH if the
sourcepath shell option is set, which it is by default:


source myfile 


If you want to execute a file only if you know it exists in the $PATH and is 
executable, and you have bash version 2.05b or higher, use type -P to 
search the $PATH. Unlike the which command, type -P only produces output 
when it finds the file, which makes it much easier to use in this case: 


LS=$(type -P ls)
[ -x "$LS" ] && $LS


# OR:


LS=$(type -P ls)
if [ -x "$LS" ]; then
    : commands involving $LS here
fi
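

The property that makes type -P convenient is its silence when nothing is found. A minimal sketch, assuming bash and a command name we invented precisely because it should not exist on your system:

```bash
#!/usr/bin/env bash
# no_such_command_bcb_demo is a made-up name that should not be on any $PATH.
tool=$(type -P ls)
missing=$(type -P no_such_command_bcb_demo)

[ -x "$tool" ]    && echo "ls found at: $tool"
[ -z "$missing" ] && echo "type -P printed nothing for the missing command"
```

Because the failure case yields an empty string rather than an error message, the result is safe to test with -x or -z before you use it.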


If you need to look in a variety of locations, possibly including the $PATH, 
use a for loop. To search each of the elements of the $PATH, use the variable 
substitution operator ${variable//pattern/replacement} to replace all of 
the : separators with a space, thereby rendering them as separate words, and 
then use for as usual to iterate over a list of words. To search the $PATH and 
other possible locations, just list them in the for statement as in these 
examples:


for path in ${PATH//:/ }; do
    [ -x "$path/ls" ] && $path/ls
done


# OR:


for path in ${PATH//:/ } /opt/foo/bin /opt/bar/bin; do
    [ -x "$path/ls" ] && $path/ls
done


If the file is not in the $PATH but could be in a list of other locations, possibly
even under different names, list the full paths for each: 


for file in /usr/local/bin/inputrc /etc/inputrc ~/.inputrc; do
    [ -f "$file" ] && bind -f "$file" && break  # Use the first one found
done


Perform any additional tests as needed. For example, you may wish to use 
screen when logging in if it’s present on the system: 


for path in ${PATH//:/ }; do
    if [ -x "$path/screen" ]; then
        # If screen(1) exists and is executable:
        for file in /opt/bin/settings/run_screen ~/settings/run_screen; do
            [ -x "$file" ] && $file && break  # Execute the first one found
        done
    fi
done


See Recipe 16.22 for more details on this code fragment. 


Discussion 


Using for to iterate through each possible location may seem like overkill, 
but it’s actually very flexible and allows you to search wherever you need to, 
apply whatever other tests are appropriate, and then do whatever you want
with the file if found. By replacing each : with a space in the $PATH, we turn
it into the kind of space-delimited list for expects (but as we also saw, any
space-delimited list will work). Adapting this technique as needed will allow
you to write some very flexible and portable shell scripts that can be highly 
tolerant of file locations. 
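

If you find yourself repeating this loop, one way to package it is as a small function. The following is only a sketch: first_executable is a name we made up, not a bash builtin, and like the loops in the Solution it assumes that no directory on your $PATH contains spaces:

```bash
#!/usr/bin/env bash
# first_executable is a hypothetical helper name, not a bash builtin.
# Usage: first_executable NAME DIR...  (prints the first DIR/NAME that qualifies)
first_executable() {
    local name=$1 dir
    shift
    for dir in "$@"; do
        # -x is also true for searchable directories, so require a regular file.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Scratch-directory demo; bcb_demo_tool is a made-up program name.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
printf '#!/bin/sh\necho hello\n' > "$demo/b/bcb_demo_tool"
chmod +x "$demo/b/bcb_demo_tool"

# Search the $PATH first, then two extra directories, as in the Solution.
found=$(first_executable bcb_demo_tool ${PATH//:/ } "$demo/a" "$demo/b")
echo "found: $found"
```

The function returns nonzero when nothing qualifies, so you can write first_executable ... || fallback just as you would with any other command.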


You may be tempted to set $IFS=':' to directly parse the $PATH, rather than
preparsing it into $path. That will work, but involves extra work with 
variables and isn’t as flexible. 


You may also be tempted to do something like the following:


[ -n "$(which myfile)" ] && bind -f $(which myfile)


The problem here is not when the file exists, but when it doesn’t. The which 
utility behaves differently on different systems. The Red Hat which is aliased 
to provide details when the argument is an alias and to set various command- 
line switches, and it returns a not found message (while which on Debian or 
FreeBSD does not). But if you try that line on NetBSD, you could end up 
trying to bind no myfile in /sbin /usr/sbin /bin /usr/bin
/usr/pkg/sbin /usr/pkg/bin /usr/X11R6/bin /usr/local/sbin
/usr/local/bin, which is not what you meant.


The command command is also interesting in this context. It’s been around
longer than type -P and may be useful under some circumstances.


Red Hat Enterprise Linux 4.x behaves like this: 


$ alias which 
alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot 
--show-tilde' 


$ which rd 
alias rd='rmdir' 
/bin/rmdir 


$ which ls
alias ls='ls --color=auto -F -h'
/bin/ls


$ which cat
/bin/cat


$ which cattt
/usr/bin/which: no cattt in
(/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/jp/bin)


$ command -v rd 
alias rd='rmdir' 


$ command -v ls 
alias ls='ls --color=auto -F -h' 


$ command -v cat 
/bin/cat 


Debian and FreeBSD (but not NetBSD or OpenBSD) behave like this: 


$ alias which 
-bash3: alias: which: not found 


$ which rd 


$ which ls 
/bin/ls 


$ which cat 
/bin/cat 


$ which cattt 


$ command -v rd 
-bash: command: rd: not found 


$ command -v ls 
/bin/ls 


$ command -v cat
/bin/cat


$ command -v ll
alias ll='ls -l'


See Also 

■ help type
■ man which
■ help source
■ man source
■ Recipe 16.22, “Getting Started with a Custom Configuration”
■ Recipe 17.4, “Recovering Disconnected Sessions Using screen”


Chapter 10. Additional Features for Scripting


Many scripts are written as simple one-off scripts that are only used by their 
author, consisting of only a few lines—perhaps only a single loop, if that. But 
some scripts are heavy-duty scripts that will see a lot of use from a variety of 
users. Such scripts will often need to take advantage of features that allow for 
better sharing and reuse of code. These advanced scripting techniques can be 
useful for many kinds of scripts, and are often found in larger systems of 
scripts such as the /etc/init.d scripts on many Linux systems. You don’t have 
to be a system administrator to appreciate and use the tips and techniques 
described here. They will prove themselves on any large scripting effort. 


10.1 “Daemon-izing” Your Script


Problem 


Sometimes you want a script to run as a daemon, in the background and 
never ending. To do this properly you need to be able to detach your script 
from its controlling TTY—that is, from the terminal session used to start the 
daemon. Simply putting an ampersand on the command isn’t enough. If you 
start your daemon script on a remote system via an SSH (or similar) session, 
you'll notice that when you log out, the SSH session doesn’t end and your 
window is hung until that script ends (which, being a daemon, it won’t). 


Solution 


Use the following to invoke your script, run it in the background, and still 
allow yourself to log out: 


nohup mydaemonscript 0<&- 1>/dev/null 2>&1 &


or:


nohup mydaemonscript >>/var/log/myadmin.log 2>&1 <&- & 


Discussion 


You need to close the controlling TTY (terminal), which is connected in three 
ways to your (or any) job: via standard input (STDIN), standard output 
(STDOUT), and standard error (STDERR). We can close STDOUT and 
STDERR by pointing them at another file—typically either a logfile, so that 
you can retrieve their output at a later time, or the file /dev/null to throw away 
all their output. We use the redirecting operator > to do this. 

But what about STDIN? The cleanest way to deal with STDIN is to close the 
file descriptor. The bash syntax to do that is like a redirect, but with a dash 
for the file-name (0<&- or <&-). 


We use the nohup command so that the script is run without being interrupted 
by a hangup signal when we log off. 


In the first example, we use the file descriptor numbers (i.e., 0, 1, 2)
explicitly in all three redirections. They are optional in the case of STDIN 
and STDOUT, so in our second example we don’t use them explicitly. We 
also put the input redirect at the end of the second command rather than at the 
beginning, since the order here is not important. (However, the order is 
important and the file descriptor number is necessary in redirecting 
STDERR.) 
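

You can rehearse the redirections without writing a real daemon. In this sketch a one-line sh -c command stands in for mydaemonscript and exits immediately, and the logfile is a scratch file created with mktemp; both are our own stand-ins rather than part of the recipe:

```bash
log=$(mktemp)

# 'sh -c ...' is a stand-in for mydaemonscript; it exits at once so the demo ends.
# STDOUT is appended to the log, STDERR follows it, and STDIN is closed.
nohup sh -c 'echo started; echo oops >&2' >>"$log" 2>&1 <&- &

wait "$!"      # a real daemon would never return; our stand-in does
cat "$log"     # both the STDOUT line and the STDERR line landed in the log
```

Swapping the order to 2>&1 >>"$log" would send STDERR to wherever STDOUT pointed before it was redirected, which is the ordering trap mentioned above.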


See Also 


■ Chapters 2 and 3 for more on redirecting output and input


10.2 Reusing Code with Includes and Sourcing 


Problem


There is a set of shell variable assignments that you would like to have
common across a set of scripts that you are writing. You tried putting this 
configuration information in its own script, but when you run that script from 
within another script, the values don’t stick; your configuration is running in 
another shell, and when that shell exits, so do your values. Is there some way 
to run that configuration script within the current shell? 


Solution 


Use the bash shell’s source command or POSIX’s single period (.) to read in
the contents of that configuration file. The lines of that file will be processed 
as if encountered in the current script. 


Here’s an example of some configuration data: 


$ cat myprefs.cfg
SCRATCH_DIR=/var/tmp
IMG_FMT=png
SND_FMT=ogg
$


It is just a simple script consisting of three assignments. Here’s another 
script, one that will use these values: 


# use the user prefs
source $HOME/myprefs.cfg
cd ${SCRATCH_DIR:-/tmp}
echo You prefer $IMG_FMT image files
echo You prefer $SND_FMT sound files


Discussion 


The script that is going to use the configuration file uses the source command 
to read in the file. It can also use a dot (.) in place of the word source. A dot 
is easy and quick to type, but hard to notice in a script or screenshot: 


. $HOME/myprefs.cfg


You wouldn’t be the first person to look right past the dot and think that the
script was just being executed. 


Sourcing is both a powerful and a dangerous feature of bash scripting. It 
gives you a way to create a configuration file and then share that file among 
several scripts. With that mechanism, you can change your configuration by 
editing one file, not several scripts. 
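

The difference between running a configuration file and sourcing it is easy to demonstrate. This sketch writes a throwaway configuration file with mktemp (our stand-in for myprefs.cfg) and checks one variable after each approach:

```bash
#!/usr/bin/env bash
# Throwaway config file standing in for myprefs.cfg.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
SCRATCH_DIR=/var/tmp
IMG_FMT=png
EOF

unset SCRATCH_DIR IMG_FMT

bash "$cfg"                        # child shell: the assignments vanish with it
after_run=${IMG_FMT:-unset}

source "$cfg"                      # current shell: the assignments stick
after_source=${IMG_FMT:-unset}

echo "after running: $after_run; after sourcing: $after_source"
# → after running: unset; after sourcing: png
```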


The contents of the configuration file are not limited to simple variable 
assignment, however. Any valid shell command is legal syntax, because 
when you source a file like this, it is simply getting its input from a different 
source; it is still the bash shell processing bash commands. Regardless of
what shell commands are in that sourced file (for example, loops or
invocations of other commands), it is all legitimate shell input and will be
run as if it were part of your script.


Here’s a modified configuration file: 


$ cat myprefs.cfg 
SCRATCH_DIR=/var/tmp 
IMG_FMT=$(cat $HOME/myimage.pref)
if [ -e /media/mp3 ]
then
    SND_FMT=mp3
else
    SND_FMT=ogg
fi
echo config file loaded 
$ 


This configuration file is hardly what one thinks of as a passive list of 
configured variables. It can run other commands (e.g., cat) and use if 
statements to vary its choices. It even ends by echoing a message. Be careful 
when you source something, as it’s a wide-open door into your script. 


One of the best uses of sourcing scripts comes when you define bash 
functions (as we will show you in Recipe 10.4). These functions can then be 
shared as a common library of functions among all the scripts that source the 
script of function definitions. 


See Also 


• The bash manpage for more about readline 
• Recipe 10.3, “Using Configuration Files in a Script” 
• Recipe 10.4, “Defining Functions” 


10.3 Using Configuration Files in a Script 


Problem 


You want to use one or more external configuration files for one or more 
scripts. 


Solution 


You could write a lot of code to parse some special configuration file format. 
Do yourself a favor and don’t do that. Just make the config file a shell script 
and use the solution in Recipe 10.2. 


Discussion 


This is just a specific application of sourcing a file. However, it’s worth 
noting that you may need to give a little thought to how you can reduce all of 
your configuration needs to bash-legal syntax. In particular, you can make 
use of Boolean flags and optional variables (see Chapter 5 and Recipe 15.11): 


# In config file
VERBOSE=0                # 0 or '' for off, 1 for on
SSH_USER='jbagadonutz@'  # Note trailing @, set to '' to use the current user


# In script
[ "${VERBOSE:-0}" = 1 ] && echo "Verbose msg from $0 goes to STDERR" >&2
[...]
ssh $SSH_USER$REMOTE_HOST [...]


Of course, depending on the user to get the configuration file correct can be 
chancy, so instead of requiring the user to read the comment and add the 
trailing @, we could do it in the script: 


# If $SSH_USER is set and doesn't have a trailing @ add it:
[ -n "$SSH_USER" -a "$SSH_USER" = "${SSH_USER%@}" ] &&
    SSH_USER="$SSH_USER@"


or just use: 


ssh ${SSH_USER:+${SSH_USER}@}${REMOTE_HOST} [...]


to make that same substitution right in place. The bash variable operator :+ 
will do the following: if $SSH_USER has a value, it will return the value to the 
right of the :+ (in this case we specified the variable itself along with an extra 
@); otherwise, if unset or empty, it will return nothing. 
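
A quick way to convince yourself of what :+ does is to wrap the expansion in a small function and call it both ways. The function name ssh_target and the host example.com are hypothetical, used only for this sketch:

```bash
#!/usr/bin/env bash
# Hypothetical helper: builds the "user@host" string the way the ssh line does.
function ssh_target {
    local SSH_USER="$1" REMOTE_HOST="$2"
    echo "${SSH_USER:+${SSH_USER}@}${REMOTE_HOST}"
}

ssh_target 'jbagadonutz' 'example.com'   # user set: jbagadonutz@example.com
ssh_target ''            'example.com'   # user empty: example.com
```

With a user set you get the user, the @, and the host; with an empty user the whole ${...:+...} expression vanishes and only the host remains.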


See Also 
• Chapter 5 
• Recipe 10.2, “Reusing Code with Includes and Sourcing” 
• Recipe 15.11, “Getting Input from Another Machine” 


10.4 Defining Functions 


Problem 


There are several places in your shell script where you would like to give the 
user a usage message (a message describing the proper syntax for the 
command), but you don’t want to keep repeating the code for the same echo 
statement. Isn’t there a way to do this just once and have several references to 
it? If you could make the usage message its own script, then you could just 
invoke it anywhere in your original script, but that requires two scripts, not 
one. Besides, it seems odd to have the message for how to use one script be 
the output of a different script. Isn’t there a better way to do this? 


Solution 


You need a bash function. At the beginning of your script, put something like 
this: 


function usage ()
{
    printf "usage: %s [ -a | -b ] file1 ... filen\n" "${0##*/}" >&2
}


Then later in your script you can write code like this: 


if [ $# -lt 1 ]
then
    usage
fi


Discussion 


Functions may be defined in several ways ([ function ] name [()] 
compound-command [ redirections ]). We could write a function 
definition any of these ways: 


function usage ()
{
    printf "usage: %s [ -a | -b ] file1 ... filen\n" "${0##*/}" >&2
}

function usage {
    printf "usage: %s [ -a | -b ] file1 ... filen\n" "${0##*/}" >&2
}

usage ()
{
    printf "usage: %s [ -a | -b ] file1 ... filen\n" "${0##*/}" >&2
}


Either the reserved word function or the trailing literal () must be present. 
If function is used, the () are optional. We like using the word function 
because it is very clear and readable, and it is easy to grep for; e.g., grep 
'^function' script will list the functions in your script file. 


This function definition should go at the top of your shell script, or at least 
somewhere before you need to invoke the function. The definition is, in a 
sense, just another bash statement. But once it has been executed, the 
function is defined. If you invoke the function before it is defined you will 
get a “command not found” error. That’s why we always put our function 
definitions first, before any other commands in our scripts. 
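
The ordering rule is easy to demonstrate. In this sketch the function name say_hello_demo is made up, and we capture the exit status rather than letting the error message clutter the screen:

```bash
#!/usr/bin/env bash
# Hypothetical function name; the point is only the ordering.
early_status=0
say_hello_demo 2>/dev/null || early_status=$?   # not defined yet

function say_hello_demo {
    echo "hello"
}

late_status=0
say_hello_demo || late_status=$?                # defined now, so it runs

echo "before definition: $early_status, after definition: $late_status"
```

The first call fails with status 127, which is the status bash uses for "command not found"; the second call, made after the definition has been executed, succeeds.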


Our function does very little; it is just a printf statement. Because we have 
this one usage message embodied in a single function, though, if we ever add 
a new option we don’t need to modify several statements scattered 
throughout the script, just this one. 


The only argument to printf beyond the format string is $0, the name by 
which the shell script was invoked, modified with the ## operator so that 
only the last part of any pathname is included. This is similar to using 
$(basename $0). 
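
Here is a small side-by-side sketch of the two approaches. The path is a hypothetical stand-in for $0 so that the result does not depend on how you run the example:

```bash
#!/usr/bin/env bash
# Hypothetical path standing in for $0.
script_path='/usr/local/bin/myscript'

via_expansion="${script_path##*/}"          # remove the longest leading match of */
via_basename="$(basename "$script_path")"   # external command, same answer here

echo "$via_expansion $via_basename"
```

Both variables end up holding myscript. The parameter expansion does the work inside the shell, while basename has to start another process.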


Since the usage message is an error message, we redirect the output of the 
printf statement to standard error. We could also have put that redirection on 
the outside of the function definition, so that all output from the function 
would be redirected. This would be convenient if we had multiple output 
statements, like this: 


function usage ()
{
    printf "usage: %s [ -a | -b ] file1 ... filen\n" "${0##*/}"
    printf "example: %s -b *.jpg \n" "${0##*/}"
    printf "or else: %s -a myfile.txt yourfile.txt \n" "${0##*/}"
} >&2


See Also 


• Recipe 5.20, “Using bash for basename” 
• Recipe 7.1, “Sifting Through Files for a String” 
• Recipe 16.15, “Creating a Better cd Command” 
• Recipe 16.16, “Creating and Changing Into a New Directory in One Step” 
• Recipe 19.14, “Avoiding “command not found” When Using Functions” 


10.5 Using Functions: Parameters and Return Values 


Problem 


You want to use a function and you need to get some values into the function. 
How do you pass in parameters? How do you get values back? 


Solution 


You don’t put parentheses around the arguments like you might expect from 
some programming languages. Put any parameters for a bash function right 
after the function’s name, separated by whitespace, just as if you were 
invoking any shell script or command. Don’t forget to quote them if 
necessary! 


# define the function:
function max ()
{ ... }
#
# call the function:
#
max 128 $SIM
max $VAR $CNT


You have two ways to get values back from a function. First, you can assign 
values to variables inside the body of your function, as in Example 10-1. 
Those variables will be global to the whole script unless they are explicitly 
declared local within the function. 


Example 10-1. ch10/func_max.1 


# cookbook filename: func_max.1 


# define the function: 
function max () 


{ 
local HIDN 
if [ $1 -gt $2 ] 
then 
BIGR=$1 
else 
BIGR=$2 
fi 
HIDN=5 
} 


For example: 


# call the function:
max 128 $SIM
# use the result:
echo $BIGR


The other way is to use echo or printf to send the output to standard output, as 
in Example 10-2. 


Example 10-2. ch10/func_max.2 


# cookbook filename: func_max.2 


# define the function: 
function max ()
{
if [ $1 -gt $2 ] 
then 
echo $1 
else 
echo $2 
fi 
} 


Then you must invoke the function inside a $(), capturing the output and 
using the result, or it will be wasted on the screen. For example: 


# call the function:
BIGR=$(max 128 $SIM)
# use the result
echo $BIGR


Discussion 


Putting parameters on the invocation of the function is just like calling any 
shell script. The parameters are just the other words on the command line. 


Within the function, the parameters are referred to as if they were command- 
line arguments by using $1, $2, etc. However, $0 is left alone. It remains the 
name by which the entire script was invoked. On return from the function, 
$1, $2, etc. are back to referring to the parameters with which the script was 
invoked. 


Also of interest is the $FUNCNAME array. $FUNCNAME all by itself references the 
zeroth element of the array, which is the name of the currently executing 
function. In other words, $FUNCNAME is to a function as $0 is to a script, 
except without all the path information. The rest of the array elements are 
what amounts to a call stack, with “main” as the bottom or last element. This 
variable only exists while a function is executing. 
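
A short sketch makes the call stack visible. The function names inner and outer are hypothetical and exist only for this illustration:

```bash
#!/usr/bin/env bash
# Hypothetical function names, used only to show the call stack.
function inner {
    echo "current: $FUNCNAME"
    echo "stack: ${FUNCNAME[*]}"
}

function outer {
    inner
}

outer
```

The first line reports inner. The second line starts with inner outer, followed by whatever called outer; when the file is run as a script, that last element is main.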


We included the useless variable $HIDN just to show that it is local to the 
function definition. Even though we can assign it values inside the function, 
any such value would not be available elsewhere in the script. It is a variable 
whose value is local to that function; it comes into existence when the 
function is called, and is gone once the function returns. 
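
You can check the scoping for yourself with a condensed variant of Example 10-1. This is our own abbreviated sketch, not the cookbook file itself:

```bash
#!/usr/bin/env bash
# Condensed variant of func_max.1, written for this illustration.
function max {
    local HIDN=5                 # exists only while max is running
    if [ "$1" -gt "$2" ]; then
        BIGR=$1                  # not declared local, so it is global
    else
        BIGR=$2
    fi
}

unset HIDN BIGR
max 128 64
echo "BIGR=${BIGR:-unset} HIDN=${HIDN:-unset}"
```

After the call, $BIGR holds 128 but $HIDN is unset again, because the local declaration confined it to the function.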


Returning values by setting variables is more efficient, and can handle lots of 
data (many variables can be set), but the approach has its drawbacks. 
Notably, it requires that the function and the rest of the script agree on 
variable names for the information hand-off. This kind of coupling has 
maintenance issues. The other approach, using the output as the way to return 
values, does reduce this coupling, but is limited in its usefulness: it is 
limited in how much data it can return before your script has to spend lots of 
effort parsing the results of the function. So which to use? As with much of 
engineering, this, too, is a trade-off and you have to decide based on your 
specific needs. 
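
To give a feel for the parsing cost of the output approach, here is a sketch of a function that hands back two values on one line. The function minmax and the variable names SMALL and BIG are hypothetical:

```bash
#!/usr/bin/env bash
# Hypothetical function: prints the smaller value, then the larger one.
function minmax {
    if [ "$1" -gt "$2" ]; then
        echo "$2 $1"
    else
        echo "$1 $2"
    fi
}

# The caller has to split the output back into separate values.
read -r SMALL BIG <<< "$(minmax 128 64)"
echo "smallest=$SMALL biggest=$BIG"
```

Two whitespace-free values are easy to split with read. Anything richer (values containing spaces, or a variable number of results) is where the parsing effort starts to grow.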


See Also 


• Recipe 1.8, “Using Shell Quoting” 
• Recipe 16.5, “Changing Your $PATH Temporarily” 


10.6 Trapping Interrupts 


Problem 


You are writing a script that needs to be able to trap signals and respond 
accordingly. 


Solution 


Use the trap utility to set signal handlers. First, use trap -l (or kill -l) to 
list the signals you may trap. They vary from system to system: 


# NetBSD
$ trap -l
 1) SIGHUP     2) SIGINT     3) SIGQUIT    4) SIGILL
 5) SIGTRAP    6) SIGABRT    7) SIGEMT     8) SIGFPE
 9) SIGKILL   10) SIGBUS    11) SIGSEGV   12) SIGSYS
13) SIGPIPE   14) SIGALRM   15) SIGTERM   16) SIGURG
17) SIGSTOP   18) SIGTSTP   19) SIGCONT   20) SIGCHLD
21) SIGTTIN   22) SIGTTOU   23) SIGIO     24) SIGXCPU
25) SIGXFSZ   26) SIGVTALRM 27) SIGPROF   28) SIGWINCH
29) SIGINFO   30) SIGUSR1   31) SIGUSR2   32) SIGPWR
$

# Linux (re-wrapped to fit on the page)
$ trap -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL
 5) SIGTRAP      6) SIGABRT      7) SIGBUS       8) SIGFPE
 9) SIGKILL     10) SIGUSR1     11) SIGSEGV     12) SIGUSR2
13) SIGPIPE     14) SIGALRM     15) SIGTERM     16) SIGSTKFLT
17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU
25) SIGXFSZ     26) SIGVTALRM   27) SIGPROF     28) SIGWINCH
29) SIGIO       30) SIGPWR      31) SIGSYS      34) SIGRTMIN
35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3  38) SIGRTMIN+4
39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12
47) SIGRTMIN+13 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14
51) SIGRTMAX-13 52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10
55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7  58) SIGRTMAX-6
59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX
$


Next, set your trap(s) and signal handlers. Note that the exit status of your 
script will be 128+signal number if the command was terminated by signal 
signal number. Here is a simple case where we only care that we got a 
signal and don’t care what it was. If our trap had been trap '' ABRT EXIT 
HUP INT QUIT TERM, this script would be rather hard to kill because any of 
those signals would just be ignored: 


$ cat hard_to_kill 

#!/bin/bash 

trap ' echo "You got me! $?" ' ABRT EXIT HUP INT QUIT TERM 
trap ' echo "Later... $?"; exit ' USR1 

sleep 120 


$ ./hard_to_kill 
^CYou got me! 130 
You got me! 130 


$ ./hard_to_kill & 
[1] 26354 


$ kill -USR1 %1 

User defined signal 1 
Later... 158 

You got me! 0 
[1]+ Done ./hard_to_kill 


$ ./hard_to_kill & 
[1] 28180 


$ kill %1 


You got me! 0 
[1]+ Terminated ./hard_to_kill 


Example 10-3 is a more interesting example. 


Example 10-3. ch10/hard_to_kill 


#!/usr/bin/env bash 
# cookbook filename: hard_to_kill 


function trapped { 
if [ "$1" = "USR1" ]; then 
echo "Got me with a $1 trap!" 
exit 
else 
echo "Received $1 trap--neener, neener" 
fi 
} 


trap "trapped ABRT" ABRT
trap "trapped EXIT" EXIT
trap "trapped HUP"  HUP
trap "trapped INT"  INT
trap "trapped KILL" KILL  # This won't actually work
trap "trapped QUIT" QUIT
trap "trapped TERM" TERM
trap "trapped USR1" USR1  # This one is special

# Just hang out and do nothing, without introducing "third-party"
# trap behavior, such as if we used 'sleep'
while (( 1 )); do
    :   # : is a NOOP
done


Here we invoke this example then try to kill it: 


$ ./hard_to_kill
^CReceived INT trap--neener, neener
^CReceived INT trap--neener, neener
^CReceived INT trap--neener, neener
^Z
[1]+ Stopped ./hard_to_kill


$ kill -TERM %1 


[1]+ Stopped ./hard_to_kill 
Received TERM trap--neener, neener 


$ jobs 
[1]+ Stopped ./hard_to_kill 


$ bg 
[1]+ ./hard_to_kill & 


$ jobs 
[1]+ Running ./hard_to_kill & 


$ kill -TERM %1 
Received TERM trap--neener, neener 


$ kill -HUP %1 
Received HUP trap--neener, neener 


$ kill -USR1 %1 
Got me with a USR1 trap! 
Received EXIT trap--neener, neener 


[1]+ Done ./hard_to_kill 


Discussion 


First, we should mention that you can’t actually trap -SIGKILL (-9). That 
signal kills processes dead immediately, so they have no chance to trap 
anything—so maybe our examples weren’t really so hard to kill after all. But 
remember that this signal does not allow the script or program to clean up or 
shut down gracefully at any time. That’s often a bad thing, so try to avoid 
using kill -KILL unless you have no other choice. 


Usage for trap is as follows: 


trap [-lp] [arg] [signal [signal]] 


The first nonoption argument to trap is the code to execute when the given 
signal is received. As shown in the previous examples, that code can be self- 
contained, or it can be a call to a function. For most nontrivial uses a call to 
one or more error handling functions is probably best, since that lends itself 
well to cleanup and graceful termination features. If this argument is the null 
string, the given signal or signals will be ignored. If the argument is - or 
missing, but one or more signals are listed, they will be reset to the shell 
defaults. -l lists the signal names, as shown in the Solution section, while -p 
will print any current traps and their handlers. 
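
As an illustration of the "call a function for cleanup" advice, here is a minimal sketch. The cleanup function and the scratch file are hypothetical, and we run the work in a subshell only so that the effect of the EXIT trap can be checked afterward:

```bash
#!/usr/bin/env bash
# Hypothetical cleanup handler; the temp file stands in for real scratch data.
function cleanup {
    rm -f "$scratch"
}

scratch=$(mktemp) || exit 1
(
    trap cleanup EXIT              # run cleanup when this subshell exits
    trap 'exit 1' HUP INT TERM     # turn these signals into an exit, so EXIT fires
    trap -p EXIT                   # print the EXIT trap and its handler
    echo "working in $scratch"
)

[ -e "$scratch" ] && leftover=yes || leftover=no
echo "scratch file left behind: $leftover"
```

Routing the other signals through exit is one common design choice: it means there is a single place, the EXIT handler, where cleanup happens no matter how the work ended.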


When using more than one trap handler, we recommend you take the extra 
time to alphabetize the signal names because that makes them easier to read 
and find later on. 


As noted previously, the exit status of your script will be 128+signal 
number if the command was terminated by signal signal number. 
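
You can watch the 128+signal number rule in action with a child shell that sends itself a signal. This sketch assumes SIGTERM is signal 15, as it is in both listings in the Solution:

```bash
#!/usr/bin/env bash
# A child shell sends itself SIGTERM; the parent inspects the exit status.
status=0
bash -c 'kill -TERM $$' || status=$?

echo "exit status: $status"                   # 128 + 15
echo "signal name: $(kill -l "$status")"      # kill -l maps the status back to a name
```

Handing an exit status greater than 128 to kill -l is a handy way to translate it back into a signal name without doing the subtraction yourself.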


There are three pseudosignals for various special purposes. The DEBUG signal 
is similar to EXIT but is used before every command for debugging purposes. 
The RETURN signal is triggered when execution resumes after a function or 
source (.) call. And the ERR signal is triggered after a simple command fails. 
Consult the Bash Reference Manual for more specific details and caveats, 
especially dealing with functions using the declare builtin or the set -o 
functrace option. 
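
Here is a small sketch of the ERR pseudosignal. We feed the commands to a separate bash process so the trap cannot interfere with anything else in your session; the message text is our own:

```bash
#!/usr/bin/env bash
# Run a tiny script in a child bash so the ERR trap stays contained.
report=$(bash <<'EOF'
trap 'echo "ERR trap fired, status $?"' ERR
true     # succeeds: the trap stays quiet
false    # fails: the trap fires
echo "script continues"
EOF
)
echo "$report"
```

The trap fires once, for false, and reports status 1. Note that the script carries on afterward: ERR by itself reports failures but does not stop anything.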


TIP 


There are some POSIX differences that affect trap. As noted in the Bash 
Reference Manual, “Starting bash with the --posix command-line option or 
executing 'set -o posix' while Bash is running will cause Bash to conform 
more closely to the POSIX standard by changing the behavior to match that 
specified by POSIX in areas where the Bash default differs.” In particular, this 
will cause kill and trap to display signal names without the leading SIG and the 
output of kill -l will be different. Also, trap will handle its argument 
somewhat more strictly; in particular, it will require a leading - in order to reset 
the trap to the shell default. In other words, it requires trap - USR1, not just 
trap USR1. We recommend that you always include the - even when not 
necessary, because it makes your intent clearer in the code. 


See Also 
• help trap 
• Recipe 1.19, “Learning More About bash Documentation” 
• Recipe 10.1, ““Daemon-izing” Your Script” 
• Recipe 14.11, “Using Secure Temporary Files” 
• Recipe 17.7, “Clearing the Screen When You Log Out” 


10.7 Redefining Commands with alias 


Problem 


You'd like to slightly alter the definition of a command, perhaps so that you 
always use a particular option on the command (e.g., always using -a on the 
ls command or -i on the rm command). 


Solution 


Use the alias feature of bash for interactive shells (only). The alias command 
is smart enough not to go into an endless loop when you say something like: 


alias ls='ls -a' 


In fact, just type alias with no other arguments and you can see a list of 
aliases that are already defined for you in your bash session. Some 
installations may already have several available for you. 


Discussion 


The alias mechanism is a straightforward text substitution. It occurs very 
early in the command-line processing, so other substitutions will occur after 
the alias. For example, if you want to define the single letter “h” to be the 
command that lists your home directory, you can do it like this: 


alias h='ls $HOME' 
or like this: 
alias h='ls ~' 


The use of single quotes is significant in the first instance, meaning that the 
variable $HOME will not be evaluated when the definition of the alias is made. 
Only when you run the command will the (string) substitution be made, and 
only then will the $HOME variable be evaluated. That way if you change the 
definition of $HOME the alias will move with it, so to speak. 


If, instead, you used double quotes, then the substitution of the variable’s 
value would be made right away and the alias would be defined with the 
value of $HOME substituted. You can see this by typing alias with no 
arguments so that bash lists all the alias definitions. You would see 
something like this: 


alias h='ls /home/youracct' 
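
To compare the two quoting styles directly, define one alias each way and ask bash to list them. The alias names h1 and h2 are hypothetical, chosen so that neither definition overwrites the other:

```bash
#!/usr/bin/env bash
# Hypothetical alias names, defined only to compare the quoting.
alias h1='ls $HOME'     # single quotes: $HOME is stored literally
alias h2="ls $HOME"     # double quotes: the current value is stored

alias h1 h2             # list just these two definitions
```

The listing for h1 still contains the text $HOME, while the listing for h2 shows whatever your home directory was at the moment the alias was defined.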


If you don’t like what your alias does and want to get rid of it, just use 
unalias and the name of the alias that you no longer want. For example: 


\unalias h 


will remove the definition we made earlier. If you get really messed up, you 
can use unalias -a to remove all the alias definitions in your current shell 
session. Why did we prefix the previous command with a backslash? The 
backslash prefix disables alias expansion for any command, so it is standard 
security best practice to use \unalias just in case some bad actor has aliased 
unalias, perhaps to “:”, to make it ineffective: 


$ alias unalias=':' 


$ alias unalias 
alias unalias=':' 


$ unalias unalias 


$ alias unalias 
alias unalias=':' 


$ \unalias unalias 


$ alias unalias 
bash: alias: unalias: not found 


Aliases do not allow arguments. For example, you cannot do this: 


alias mcd='mkdir $1 && cd $1' 


The difference between $1 and $HOME is that $HOME is defined (one way or 
another) when the alias itself is defined, while you’d expect $1 to be passed 
in at runtime. Sorry, that doesn’t work. Use a function instead. 
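
A function version might look like the following sketch. The name mkcd is our own choice for this illustration, and we use mkdir -p so that intermediate directories are created too; Recipe 16.16 treats the idea more thoroughly:

```bash
#!/usr/bin/env bash
# Hypothetical function name; unlike an alias, a function receives $1 at runtime.
function mkcd {
    mkdir -p "$1" && cd "$1"
}

# Try it in a throwaway directory.
base=$(mktemp -d) || exit 1
mkcd "$base/new/dir"
pwd
```

Because the body is only evaluated when the function is called, $1 refers to the argument you typed, which is exactly what the alias could not do.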


See Also 

• Appendix C for more on command-line processing 
• Recipe 10.4, “Defining Functions” 
• Recipe 10.5, “Using Functions: Parameters and Return Values” 
• Recipe 14.4, “Clearing All Aliases” 
• Recipe 16.16, “Creating and Changing Into a New Directory in One Step” 


10.8 Avoiding Aliases and Functions 


Problem 


You’ve written an alias or function to override a real command, but now you 
want to execute the real command. 


Solution 


Use the bash shell’s builtin command to ignore shell functions and aliases 
and run an actual builtin command. 


Use the command command to ignore shell functions and aliases and run an 
actual external command. 


If you only want to avoid alias expansion, but still allow function definitions 
to be considered, then prefix the command with \ to just prevent alias 
expansion. 


Use the type command (also with -a) to figure out what you’ve got. 


Here are some examples: 


$ alias echo='echo ~~~' 


$ echo test 
~~~ test 


$ \echo test 
test 


$ builtin echo test 
test 


$ type echo 
echo is aliased to `echo ~~~' 


$ unalias echo 


$ type echo 
echo is a shell builtin 


$ type -a echo 
echo is a shell builtin 


echo is /bin/echo 


$ echo test 
test 


Here is a function definition that we will discuss: 


function cd ()
{
    if [[ $1 = "..." ]]
    then
        builtin cd ../..
    else
        builtin cd "$1"
    fi
}


Discussion 


The alias command is smart enough not to go into an endless loop when you 
say something like alias ls='ls -a' or alias echo='echo ~~~', so in our 
first example we need to do nothing special on the righthand side of our alias 
definition to refer to the actual echo command. 


When we have echo defined as an alias, the type command will not only tell 
us that this is an alias, but show us the alias definition. Similarly, with 
function definitions, we would be shown the actual body of the function. 
type -a some_command will show us all of the places (aliases, builtins, 
functions, and external) that contain some_command (as long as we are not 
also using -p). 


In our last example, the function overrides the definition of cd so that we can 
add a simple shortcut. We want our function to understand that cd ... 
means to go up two directories; i.e., cd ../.. (see Recipe 16.15). All other 
arguments will be treated as normal. Our function simply looks for a match 
with ... and substitutes the real meaning. But how, within (or without) the 
function, do we invoke the underlying cd command so as to actually change 
directories? The builtin command tells bash to assume that the command that 
follows is a shell builtin command and not to use any alias or function 
definition. We use it within the function, but it can be used at any time to 
refer, unambiguously, to the actual command, avoiding any function name 
that might be overriding it. 


If your function name is that of an executable, like ls, and not a builtin 
command, then you can override any alias and/or function definition by just 
referring to the full path to the executable, such as /bin/ls rather than just ls as 
the command. If you don’t know its full path, just prefix the command with 
the keyword command and bash will ignore any alias and function definitions 
with that name and use the actual command. Please note, however, that the 
$PATH variable will still be used to determine the location of the command. If 
you are running the wrong ls because your $PATH has some unexpected 
values, adding command will not help in that situation. 
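
The following sketch pulls these pieces together. The two overriding functions are deliberately silly and purely hypothetical; they exist only so there is something to bypass:

```bash
#!/usr/bin/env bash
# Hypothetical overrides, defined only so we have something to bypass.
function ls {                # hides the external ls
    echo "function ls called"
}
function pwd {               # hides the builtin pwd
    echo "function pwd called"
}

ls                           # runs our function
pwd                          # runs our function
type -t ls pwd               # reports "function" for both names
command ls -d /              # skips the function; finds the real ls via $PATH
builtin pwd > /dev/null      # skips the function; runs the real builtin
```

Use command when the real thing is an external program and builtin when it is part of the shell; type tells you which of the two you are dealing with.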


See Also 

• help builtin 
• help command 
• help type 
• Recipe 14.4, “Clearing All Aliases” 
• Recipe 16.15, “Creating a Better cd Command” 


10.9 Counting Elapsed Time 


Problem 


You want to display how long a script, or an operation in a script, takes. 


Solution 


Use the time builtin or the bash variable $SECONDS. 


Discussion 


time reports the time used by a process or pipeline in a variety of ways: 


$ time sleep 4 


real 0m4.029s 
user 0m0.000s 
sys 0m0.000s 


$ time sha256sum /bin/* &> /dev/null 


real 0m1.252s 
user 0m0.072s 
sys 0m0.028s 


You can use time for commands or functions inside a script, but you can’t 
time the entire script from inside itself. You can certainly add time to a 
calling script or cron job, but be aware if you add it to cron that there will 
always be output, so you will always get a cron email about the run. 


If that seems like overkill or you just want to know how long the entire script 
took, you can use $SECONDS. According to the Bash Reference Manual: 


[$SECONDS] expands to the number of seconds since the shell was started. 
Assignment to this variable resets the count to the value assigned, and the 
expanded value becomes the value assigned plus the number of seconds 
since the assignment. 


Examples: 


$ cat seconds 
started="$SECONDS" 
sleep 4 
echo "Run-time = $(($SECONDS - $started)) seconds..." 


$ bash seconds 
Run-time = 4 seconds... 


$ time bash seconds 
Run-time = 4 seconds... 


real 0m4.003s 
user 0m0.000s 
sys 0m0.000s 
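
Because assigning to $SECONDS resets the count, you can also time just one section of a script without keeping a separate start value. The variable name elapsed and the minutes:seconds formatting are our own additions in this sketch:

```bash
#!/usr/bin/env bash
# Reset the counter, do the work, then read the counter back.
SECONDS=0
sleep 2                          # stands in for the operation being timed
elapsed=$SECONDS

printf 'Run-time = %d seconds (%02d:%02d)\n' \
    "$elapsed" $((elapsed / 60)) $((elapsed % 60))
```

Keep in mind that $SECONDS only has whole-second resolution, so the figure can be off by a second either way; use time when you need finer detail.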


See Also 
• help time 
• The Bash Reference Manual for your bash version (see 
  http://www.bashcookbook.com/bashinfo/) 


10.10 Writing Wrappers 


Problem 


You have a series of related commands or tools that you often need to use in 
an ad hoc manner, and you want to collect them in one place to make them 
easier to use and remember. 


Solution 


Write a shell script “wrapper” using case..esac blocks as needed. 


Discussion 


There are two basic ways to handle needs like this. One is to write a lot of 
tiny shell scripts, or perhaps aliases, to handle all the needs. This is the 
approach taken by BusyBox where a large number of tools are really just 
symlinks to a single binary. The other is like the majority of revision control 
tools, where you call a single binary like a “prefix,” then add the action or 
command. Both approaches have merit, but we tend to prefer the second one 
because you only have to remember the single prefix command. 


There is an excellent discussion of this concept and a more complicated 
implementation in Signal v. Noise; we encourage you to read about it. Our 
implementation is a bit simpler, and has some handy tricks. Some of our 
basic design considerations are as follows: 


• Simple to read and understand 
• Simple to add to 
• Built-in, inline help that’s easy to write 
• Easy to use and remember 


We wrote the second edition of this book in Asciidoc, and there is a lot of 
markup to remember, so here’s an excerpt from a tool we wrote to help us 
(Example 10-4). This tool can get input from the command line or it can read 
from and write to the Linux clipboard. 


Example 10-4. ch10/ad 


#!/usr/bin/env bash
# cookbook filename: ad
# O'Reilly "book" tool wrapper for Asciidoc

# Trivial sanity checks (1)
[ -n "$BOOK_ASC" ] || {
    echo "FATAL: must export \$BOOK_ASC to the location of '...bcb2/head/asciidoc/'!"
    exit 1
}
\cd "$BOOK_ASC" || {
    echo "FATAL: can't cd to '$BOOK_ASC'!"
    exit 2
}

SELF="$0"      # For clarity in recursion (2)
action="$1"    # For code readability (3)
shift          # Remove that argument from the list (4)

# If `xsel` is executable and we have no more arguments... (5)
[ -x /usr/bin/xsel -a $# -lt 1 ] && {
    # Read/write the clipboard
    text=$(xsel -b)
    function Output {
        echo -en "$*" | xsel -bi
    }
} || {    # Otherwise...
    # Read/write STDIN/STDOUT
    text=$*
    function Output {
        echo -en "$*"
    }
}

case "$action" in

    #######################################################################
    # Content/Markup (6)

    rec|recipe )   # Create the tags for a new recipe
        id="$($SELF id $text)"  # Create an "ID" (7)
        # (8)
        Output "$(cat <<- EoF
		[[$id]]
		=== $text

		[[problem-$id]]
		==== Problem

		[[solution-$id]]
		==== Solution

		[[discussion-$id]]
		==== Discussion

		[[see_also-$id]]
		==== See Also

		* \`man \`
		* item1
		* <<xref-id-here>>
		* URL[text]
		EoF
        )"
    ;;

    table )        # Create the tags for a new table
        # (9)
        Output "$(cat <<- EoF
		.A Table
		[options="header"]
		|=======
		|head|h|h
		|cell|c|c
		|cell|c|c
		|=======
		EoF
        )"
    ;;

    ### Headers
    h1 )           # Inside chapter head 1 (really Asciidoc h3)
        Output "=== $text"    # (10)
    ;;
    h2 )           # Inside chapter head 2 (really Asciidoc h4)
        Output "==== $text"
    ;;
    h3 )           # Inside chapter head 3 (really Asciidoc h5)
        Output "===== $text"
    ;;

    ### Lists
    bul|bullet )   # Bullet list (.. = level 2, + = multiline element)
        Output ". $text"
    ;;
    nul|number|order* ) # Num./ordered list (## = level 2, + = multiline element)
        Output "# $text"
    ;;
    term )         # Terms
        Output "term_here::\n $text"    # (11)
    ;;

    cleanup )      ## Clean up all the xHTML/XML/PDF cruft
        rm -fv {ch??,app?}.{pdf,xml,html} book.xml docbook-xsl.css    # (12)
    ;;

    * )
        \cd - > /dev/null # UGLY cheat to revert the 'cd' above... (13)
        # See also: http://stackoverflow.com/questions/59895/
        # can-a-bash-script-tell-what-directory-its-stored-in
        ( echo "Usage:"
        egrep '\)[[:space:]]+# '  $SELF
        echo ''
        egrep '\)[[:space:]]+## ' $SELF ) | more    # (14)
    ;;

esac


(1) Sanity-check required variables and locations. 

(2) Set a more readable name for recursion. 

(3) Set a more readable name for the command or action we’re going to take. 

(4) Remove that argument from the list so we don’t reuse or include it in the 
    input or output later. 

(5) If the xsel command is available and executable, and we passed no other 
    arguments, then set up the input and output to be from and to the 
    clipboard. That turns this script into an application-generic macro tool! 
    No matter what editor you are using, if you have a GUI and read from 
    and write to the clipboard, if you switch to a terminal session you can 
    copy text, process it, and paste it easily, which is a really handy thing to 
    be able to do! 

(6) Each block in the case..esac is both the code and the documentation. 
    The number of # characters determines the section, so the code can be in 
    whatever order makes sense, but the help/usage can vary from that. 

(7) Take the input text and make a recursive call to get an ID out of that, then 
    output the boilerplate markup. 

(8) Note that inside the here-document the indentation must be tabs. 

(9) Sometimes the boilerplate markup doesn’t include any input text. 

(10) Sometimes the operation is very simple, like just remembering how many 
    equals signs are needed. 

(11) Sometimes the operation is a bit more complicated, with embedded 
    newlines and expanded escape characters. 

(12) Actions can do anything you can think of and figure out how to automate! 

(13) If you don’t provide any arguments, or provide incorrect arguments, even 
    including ones like -h or --help, you get a generated usage message. 

(14) We wrap the blocks in a () subshell to get the output in the right order 
    and send it all into the more command. The two egrep commands display 
    our case..esac section lines, as in (6), which are both code and 
    documentation, grouped by the count of # characters (one or two). 


TIP 
Use pbcopy and pbpaste instead of xsel on a Mac. 


Example usage: 


$ ad
Usage:
    rec|recipe )         # Create the tags for a new recipe
    table )              # Create the tags for a new table
    h1 )                 # Inside chapter heading 1 (really Asciidoc h3)
    h2 )                 # Inside chapter heading 2 (really Asciidoc h4)
    h3 )                 # Inside chapter heading 3 (really Asciidoc h5)
    bul|bullet )         # Bullet list (.. = level 2, + = multiline element)
    nul|number|order* )  # Num./ordered list (## = level 2, + = multiline element)
    term )               # Terms

    cleanup )            ## Clean up all the xHTML/XML/PDF cruft


To use ad to create the tags for a new recipe, like this one, you would type
out the title, select it, open or flip to a terminal window, type ad rec, flip 
back to your editor, and paste it in. It’s much easier than it sounds and much 
faster to do than to describe. The beauty of this kind of script is that it works 
for all kinds of problems, it’s usually easy to extend, and the usage reminders 
all but write themselves. We’ve used scripts following this pattern to: 


• Write the second edition of this book
• Wrap up various SSH commands to do common chores on groups of
  servers
• Collect various Debian package system tools, prior to the advent of apt
• Automate various “cleanup” tasks like trimming whitespace, sorting, and
  performing various simple text manipulations like stripping out rich-text
  formatting
• Automate grep commands to search various specific file types and
  locations for notes and archived documentation


See Also 


• https://signalvnoise.com/posts/3264-automating-with-convention-introducing-sub
• https://github.com/37signals/sub
• Appendix D
• Recipe 15.17, “Automating a Process Using Phases”
• http://docs.atlas.oreilly.com/writing_in_asciidoc.html


Chapter 11. Working with Dates and Times


Working with dates and times should be simple, but it’s not. Regardless of 
whether you’re writing a shell script or a much larger program, timekeeping 
is full of complexities: different formats for displaying the time and date, 
Daylight Saving Time, leap years, leap seconds, and all of that. For example, 
imagine that you have a list of contracts and the dates on which they were 
signed. You’d like to compute expiration dates for all of those contracts. It’s 
not a trivial problem: does a leap year get in the way? Is it the sort of contract 
where Daylight Saving Time is likely to be a problem? And how do you 
format the output so that it’s unambiguous? Does 7/4/07 mean July 4, 2007, 
or does it mean April 7? 


Dates and times permeate every aspect of computing. Sooner or later you are 
going to have to deal with them: in system, application, or transaction logs; in 
data processing scripts; in user or administrative tasks; and more. This 
chapter will help you deal with them as simply and cleanly as possible. 
Computers are very good at keeping time accurately, particularly if they are 
using the Network Time Protocol (NTP) to keep themselves synced with 
national and international time standards. They’re also great at understanding 
the variations in Daylight Saving Time from locale to locale. To work with 
time in a shell script, you need the Unix date command (or even better, the 
GNU version of the date command, which is standard on Linux). date is 
capable of displaying dates in different formats and even doing date 
arithmetic correctly. 

Note that gawk (the GNU version of awk) has the same strftime formatting as 
the GNU date command. We’re not going to cover gawk usage here except 
for one trivial example. We recommend sticking with GNU date because it’s 
much easier to use and it has the very useful -d argument. But keep gawk in 
mind should you ever encounter a system that has gawk but not GNU date.


11.1 Formatting Dates for Display


Problem 


You need to format dates or times for output. 


Solution 


Use the date command with a strftime format specification. See “Date and 
Time String Formatting with strftime” in Appendix A or the strftime 
manpage for the list of format specifications supported: 


# Setting environment variables can be helpful in scripts:
$ STRICT_ISO_8601='%Y-%m-%dT%H:%M:%S%z'  # Strict ISO 8601 format
$ ISO_8601='%Y-%m-%d %H:%M:%S %Z'        # Almost ISO 8601, but more human-readable
$ ISO_8601_1='%Y-%m-%d %T %Z'            # %T is the same as %H:%M:%S
$ DATEFILE='%Y%m%d%H%M%S'                # Suitable for use in a filename

$ date "+$ISO_8601"
2006-05-08 14:36:51 CDT

$ gawk "BEGIN {print strftime(\"$ISO_8601\")}"
2006-12-07 04:38:54 EST

# Same as previous $ISO_8601
$ date '+%Y-%m-%d %H:%M:%S %Z'
2006-05-08 14:36:51 CDT

$ date -d '2005-11-06' "+$ISO_8601"
2005-11-06 00:00:00 CST

$ date "+Program starting at: $ISO_8601"
Program starting at: 2006-05-08 14:36:51 CDT

$ printf "%b" "Program starting at: $(date '+$ISO_8601')\n"
Program starting at: $ISO_8601

$ echo "I can rename a file like this: mv file.log file_$(date +$DATEFILE).log"
I can rename a file like this: mv file.log file_20060508143724.log


Discussion 


You may be tempted to place the + in the environment variable to simplify
the later command, but on some systems the date command is pickier about
the existence and placement of the + than on others. Our advice is to
explicitly add it to the date command itself.

Many more formatting options are available; see the date manpage or the C 
strftime() function (man 3 strftime) on your system for a full list. 
Unless otherwise specified, the time zone is assumed to be local time as 
defined by your system. The %z format is a nonstandard extension used by 
the GNU date command; it may not work on your system. 


ISO 8601 is the recommended standard for displaying dates and times and 
should be used if at all possible. It offers a number of advantages over other 
display formats: 


• It is a recognized standard.
• It is unambiguous.
• It is easy to read while still being easy to parse programmatically (e.g.,
  using awk or cut).
• It sorts as expected when used in columnar data or in filenames.


Try to avoid MM/DD/YY or DD/MM/YY (or even worse, M/D/YY or 
D/M/YY) formats. They do not sort well and they are ambiguous, since either 
the day or the month may come first depending on geographical location, 
which also makes them hard to parse. Likewise, use 24-hour time when 
possible to avoid even more ambiguity and parsing problems. 
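
As a small illustration of the sorting point, the sketch below sorts the same three dates written two ways. The sample dates are made up for the example, and nothing beyond printf and sort is assumed.

```shell
# Three sample dates (hypothetical), in ISO 8601 and in MM/DD/YY form
iso_dates='2007-04-07
2006-12-25
2007-01-15'

us_dates='04/07/07
12/25/06
01/15/07'

# A plain lexical sort puts the ISO 8601 dates in chronological order...
iso_sorted=$(printf '%s\n' "$iso_dates" | LC_ALL=C sort)

# ...but sorts MM/DD/YY by month first, so December 2006 lands last
us_sorted=$(printf '%s\n' "$us_dates" | LC_ALL=C sort)

printf 'ISO 8601:\n%s\n\nMM/DD/YY:\n%s\n' "$iso_sorted" "$us_sorted"
```

The ISO 8601 list comes out oldest first, while the MM/DD/YY list puts the oldest date at the bottom.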


See Also 


• man date
• man 3 strftime
• https://en.wikipedia.org/wiki/ISO_8601
• https://www.iso.org/iso-8601-date-and-time-format.html
• “Date and Time String Formatting with strftime” in Appendix A


11.2 Supplying a Default Date 


Problem 


You want your script to provide a useful default date, and perhaps prompt the 
user to verify it. 


Solution 


Using the GNU date command, assign the most likely date to a variable, then 
allow the user to change it (see Example 11-1). 


Example 11-1. ch11/default_date 


#!/usr/bin/env bash
# cookbook filename: default_date

# Use noon time to prevent a script running around midnight and a clock a
# few seconds off from causing off by one day errors.
START_DATE=$(date -d 'last week Monday 12:00:00' '+%Y-%m-%d')

while [ 1 ]; do
    printf "%b" "The starting date is $START_DATE, is that correct? (Y/new date) "
    read answer

    # Anything other than ENTER, "Y", or "y" is validated as a new date
    # Could use "[Yy]*" to allow the user to spell out "yes"...
    # Validate the new date format as: CCYY-MM-DD
    case "$answer" in
        [Yy]) break
            ;;
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
            printf "%b" "Overriding $START_DATE with $answer\n"
            START_DATE="$answer"
            ;;
        *)  printf "%b" "Invalid date, please try again...\n"
            ;;
    esac
done

END_DATE=$(date -d "$START_DATE +7 days" '+%Y-%m-%d')

echo "START_DATE: $START_DATE"
echo "END_DATE:   $END_DATE"


Discussion 


Not all date commands support the -d option, but the GNU version does. Our 
advice is to obtain and use the GNU date command if at all possible. 


Leave out the user verification code if your script is running unattended or at 
a known time (e.g., from cron). 


See Recipe 11.1 for information about how to format the dates and times. 


We use code like this in scripts that generate SQL queries. The script runs at 
a given time and creates a SQL query for a specific date range to generate a 
report. 
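
For the unattended case, a minimal sketch might look like the following. It assumes GNU date, and the function name, the table name (access_log), and the column name (log_date) are hypothetical, chosen only for illustration.

```shell
# build_report_query START_DATE
# Print a SQL query covering the 7 days beginning at START_DATE (CCYY-MM-DD).
# The table and column names are placeholders, not from any real schema.
build_report_query() {
    start_date=$1
    # Same date arithmetic as default_date: GNU date's -d option
    end_date=$(date -d "$start_date +7 days" '+%Y-%m-%d') || return 1
    printf "SELECT * FROM access_log WHERE log_date >= '%s' AND log_date < '%s';\n" \
        "$start_date" "$end_date"
}

build_report_query '2005-10-24'
```

Using a half-open range (>= start, < end) keeps consecutive weekly reports from counting the boundary day twice.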


See Also 


• man date
• Recipe 11.1, “Formatting Dates for Display”
• Recipe 11.3, “Automating Date Ranges”


11.3 Automating Date Ranges 


Problem 


You have one date (perhaps from Recipe 11.2) and you would like to
generate another automatically.


Solution 


The GNU date command is very powerful and flexible, but the power of -d 
isn’t documented well. Your system may document it under getdate (try the 
getdate manpage). Here are some examples: 


$ date '+%Y-%m-%d %H:%M:%S %z'
2005-11-05 01:03:00 -0500

$ date -d 'today' '+%Y-%m-%d %H:%M:%S %z'
2005-11-05 01:04:39 -0500

$ date -d 'yesterday' '+%Y-%m-%d %H:%M:%S %z'
2005-11-04 01:04:48 -0500

$ date -d 'tomorrow' '+%Y-%m-%d %H:%M:%S %z'
2005-11-06 01:04:55 -0500

$ date -d 'Monday' '+%Y-%m-%d %H:%M:%S %z'
2005-11-07 00:00:00 -0500

$ date -d 'this Monday' '+%Y-%m-%d %H:%M:%S %z'
2005-11-07 00:00:00 -0500

$ date -d 'last Monday' '+%Y-%m-%d %H:%M:%S %z'
2005-10-31 00:00:00 -0500

$ date -d 'next Monday' '+%Y-%m-%d %H:%M:%S %z'
2005-11-07 00:00:00 -0500

$ date -d 'last week' '+%Y-%m-%d %H:%M:%S %z'
2005-10-29 01:05:24 -0400

$ date -d 'next week' '+%Y-%m-%d %H:%M:%S %z'
2005-11-12 01:05:29 -0500

$ date -d '2 weeks' '+%Y-%m-%d %H:%M:%S %z'
2005-11-19 01:05:42 -0500

$ date -d '-2 weeks' '+%Y-%m-%d %H:%M:%S %z'
2005-10-22 01:05:47 -0400

$ date -d '2 weeks ago' '+%Y-%m-%d %H:%M:%S %z'
2005-10-22 01:06:00 -0400

$ date -d '+4 days' '+%Y-%m-%d %H:%M:%S %z'
2005-11-09 01:06:23 -0500

$ date -d '-6 days' '+%Y-%m-%d %H:%M:%S %z'
2005-10-30 01:06:30 -0400

$ date -d '2000-01-01 +12 days' '+%Y-%m-%d %H:%M:%S %z'
2000-01-13 00:00:00 -0500

$ date -d '3 months 1 day' '+%Y-%m-%d %H:%M:%S %z'
2006-02-06 01:03:00 -0500


Discussion


The -d option allows you to specify a specific date instead of using “now,” 
but not all date commands support it. The GNU version does, and our advice 
is to obtain and use that version if at all possible. 


Using -d can be tricky. These arguments work as expected: 


$ date '+%a %Y-%m-%d' 
Sat 2005-11-05 


$ date -d 'today' '+%a %Y-%m-%d' 
Sat 2005-11-05 


$ date -d 'Saturday' '+%a %Y-%m-%d' 
Sat 2005-11-05 


$ date -d 'last Saturday' '+%a %Y-%m-%d' 
Sat 2005-10-29 


$ date -d 'this Saturday' '+%a %Y-%m-%d'
Sat 2005-11-05


But if you run this on Saturday, you would expect to see next Saturday but
instead will get today:


$ date -d 'next Saturday' '+%a %Y-%m-%d'
Sat 2005-11-05


Also watch out for this week day, because as soon as the specified day is in
the past, this week becomes next week. So, if you ran the following command
on Saturday 2005-11-05 you would get these results, which may not be what
you were expecting:


$ date -d 'this week Friday' '+%a %Y-%m-%d'
Fri 2005-11-11


The -d option can be incredibly useful, but be sure to thoroughly test your
code and provide appropriate error checking.
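
One way to sidestep the ambiguity is to do the weekday arithmetic yourself and hand GNU date only a plain day offset. The following is a sketch under that assumption; the function name next_weekday is ours, and it relies on GNU date for both -d and %u.

```shell
# next_weekday REF_DATE TARGET_DOW
# Print the first date strictly after REF_DATE (CCYY-MM-DD) that falls on
# TARGET_DOW, numbered as date's %u does: 1 = Monday ... 7 = Sunday.
next_weekday() {
    ref_date=$1
    target_dow=$2
    ref_dow=$(date -d "$ref_date" '+%u') || return 1
    # Days until the target weekday, in the range 0..6
    days_ahead=$(( (target_dow - ref_dow + 7) % 7 ))
    # 0 means REF_DATE is already that weekday, so jump a full week
    if [ "$days_ahead" -eq 0 ]; then
        days_ahead=7
    fi
    date -d "$ref_date +$days_ahead days" '+%Y-%m-%d'
}

next_weekday '2005-11-05' 6    # Saturday after Saturday 2005-11-05: 2005-11-12
next_weekday '2005-11-05' 5    # Friday after Saturday 2005-11-05: 2005-11-11
```

Because the reference date is an argument rather than “now,” the function can be tested against known dates before it is trusted in a script.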


If you don’t have GNU date, you may find the following shell functions,
presented in “Shell Corner: Date-Related Shell Functions” in the September
2005 issue of Unix Review, to be useful:


pn_month
    Previous and next x months relative to the given month

end_month
    End of month of the given month

pn_day
    Previous and next x days relative to the given day

cur_weekday
    Day of week for the given day

pn_weekday
    Previous and next x days of the week relative to the given day


And these are available in newer versions of bash:


pn_day_nr
    (Nonrecursive) Previous and next x days relative to the given day

days_between
    Number of days between two dates


Note that pn_month, end_month, and cur_weekday are independent of the 
rest of the functions. However, pn_day is built on top of pn_month and 
end_month, and pn_weekday is built on top of pn_day and cur_weekday. 


See Also 


• man date
• man getdate
• http://www.drdobbs.com/shell-corner-date-related-shell-function/199102857
• Recipe 11.2, “Supplying a Default Date”


11.4 Converting Dates and Times to Epoch 
Seconds 


Problem 


You want to convert a date and time to epoch seconds to make it easier to do 
date and time arithmetic. 


Solution 


Use the GNU date command with the nonstandard -d option and a standard
%s format:


# "Now" is easy
$ date '+%s'
1131172934


# Some other time needs the nonstandard -d
$ date -d '2005-11-05 12:00:00 +0000' '+%s'
1131192000


Epoch seconds are simply the number of seconds since the epoch (which is
midnight UTC on January 1, 1970, also known as 1970-01-01T00:00:00Z). Given a
date and time, the command counts the seconds elapsed since the epoch and
prints that number.


Discussion 


If you do not have the GNU date command available, this is a harder problem 
to solve. Our advice is to obtain and use the GNU date command if at all 
possible. If that is not possible, you might be able to use Perl. Here are three 
ways to print the time right now in epoch seconds: 


$ perl -e 'print time, qq(\n);'
1154158997


# Same as above
$ perl -e 'use Time::Local; print timelocal(localtime()) . qq(\n);'
1154158997


$ perl -e 'use POSIX qw(strftime); print strftime("%s", localtime()) . qq(\n);'
1154159097


Using Perl to convert a specific day and time instead of “right now” is even
harder due to Perl’s date/time data structure. Years start at 1900 and months
(but not days) start at 0 instead of 1. The format of the command is:
timelocal(sec, min, hour, day, month-1, year-1900). So, to convert
2005-11-05 06:59:49 to epoch seconds:


# The given time is in local time
$ perl -e 'use Time::Local;
> print timelocal("49", "59", "06", "05", "10", "105") . qq(\n);'
1131191989


# The given time is in UTC time
$ perl -e 'use Time::Local;
> print timegm("49", "59", "06", "05", "10", "105") . qq(\n);'
1131173989


See Also 


• man date
• Recipe 11.5, “Converting Epoch Seconds to Dates and Times”
• “Date and Time String Formatting with strftime” in Appendix A


11.5 Converting Epoch Seconds to Dates and 
Times 


Problem 


You need to convert epoch seconds to a human-readable date and time. 


Solution 

Use the GNU date command with your desired format from Recipe 11.1:


$ EPOCH='1131173989'


$ date -d "1970-01-01 UTC $EPOCH seconds" +"%Y-%m-%d %T %z"
2005-11-05 01:59:49 -0500


$ date --utc --date "1970-01-01 $EPOCH seconds" +"%Y-%m-%d %T %z"
2005-11-05 06:59:49 +0000
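
Newer versions of GNU date also accept an @ prefix as shorthand for epoch seconds, which avoids spelling out the epoch by hand. The sketch below assumes your GNU date is recent enough to support it (check your manpage), and it forces UTC so the result does not depend on the local time zone. The variable name utc_stamp is ours.

```shell
EPOCH='1131173989'

# @N means N seconds since the epoch; --utc keeps the output in UTC
utc_stamp=$(date --utc -d "@$EPOCH" '+%Y-%m-%d %T %z')
echo "$utc_stamp"    # 2005-11-05 06:59:49 +0000
```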


Discussion 


If you don’t have GNU date on your system you can try one of these Perl
one-liners:


$ EPOCH='1131173989'


$ perl -e "print scalar(gmtime($EPOCH)), qq(\n);" # UTC
Sat Nov  5 06:59:49 2005


$ perl -e "print scalar(localtime($EPOCH)), qq(\n);" # Your local time
Sat Nov  5 01:59:49 2005


$ perl -e "use POSIX qw(strftime);
> print strftime('%Y-%m-%d %H:%M:%S', localtime($EPOCH)), qq(\n);"
2005-11-05 01:59:49


See Also 

• man date
• Recipe 11.1, “Formatting Dates for Display”
• Recipe 11.4, “Converting Dates and Times to Epoch Seconds”
• “Date and Time String Formatting with strftime” in Appendix A


11.6 Getting Yesterday or Tomorrow with Perl 


Problem 


You need to get yesterday or tomorrow’s date, and you have Perl but not 
GNU date on your system. 


Solution 


Use these Perl one-liners, adjusting the number of seconds added to or 
subtracted from time: 


# Yesterday at this same time (note subtraction)
$ perl -e "use POSIX qw(strftime);
> print strftime('%Y-%m-%d', localtime(time - 86400)), qq(\n);"
2005-11-04


# Tomorrow at this same time (note addition)
$ perl -e "use POSIX qw(strftime);
> print strftime('%Y-%m-%d', localtime(time + 86400)), qq(\n);"
2005-11-06


Discussion 


This is really just a specific application of the preceding recipes, but it’s so 
common that it’s worth talking about by itself. See Recipe 11.7 for a handy 
table of values that may be of use. 


See Also 


• Recipe 11.2, “Supplying a Default Date”
• Recipe 11.3, “Automating Date Ranges”
• Recipe 11.4, “Converting Dates and Times to Epoch Seconds”
• Recipe 11.5, “Converting Epoch Seconds to Dates and Times”
• Recipe 11.7, “Figuring Out Date and Time Arithmetic”
• “Date and Time String Formatting with strftime” in Appendix A


11.7 Figuring Out Date and Time Arithmetic 


Problem 


You need to do some kind of arithmetic with dates and times. 


Solution 


If you can’t get the answer you need using the date command (see Recipe 
11.3), convert your existing dates and times to epoch seconds using Recipe 
11.4, perform your calculations, then convert the resulting epoch seconds 
back to your desired format using Recipe 11.5.


TIP


If you don’t have GNU date, you may find the shell functions presented in 
“Shell Corner: Date-Related Shell Functions” in the September 2005 issue of 
Unix Review to be very useful. See Recipe 11.3. 


For example, suppose you have log data from a machine where the time was 
badly off. Everyone should already be using the Network Time Protocol so 
this doesn’t happen, but just suppose: 


CORRECTION='172800'  # 2 days' worth of seconds

# Code to extract the date portion from the data
# into $bad_date goes here

# Suppose it's this:
bad_date='Jan 2 05:13:05'  # syslog-formatted date

# Convert to epoch seconds using GNU date
bad_epoch=$(date -d "$bad_date" '+%s')

# Apply correction
good_epoch=$(( bad_epoch + $CORRECTION ))

# Make corrected date human-readable, with GNU date
good_date=$(date -d "1970-01-01 UTC $good_epoch seconds")
good_date_iso=$(date -d "1970-01-01 UTC $good_epoch seconds" +'%Y-%m-%d %T')

echo "bad_date:      $bad_date"
echo "bad_epoch:     $bad_epoch"
echo "Correction:    +$CORRECTION"
echo "good_epoch:    $good_epoch"
echo "good_date:     $good_date"
echo "good_date_iso: $good_date_iso"

# Code to insert the $good_date back into the data goes here


WARNING


Watch out for years! Some Unix commands, like ls and syslog, try to be easy to
read and omit the year under certain conditions. You may need to take that into
account when calculating your correction factor. If you have data from a large
range of dates or from different time zones, you will have to find some way to
break it into separate files and process them individually.


Discussion 


Dealing with any kind of date arithmetic is much easier using epoch seconds 
than any other format of which we are aware. You don’t have to worry about 
hours, days, weeks, or years; you just do some simple addition or subtraction 
and you’re all set. Using epoch seconds also avoids all the convoluted rules 
about leap years and seconds, and if you standardize on one time zone 
(usually UTC, which used to be called GMT) you can even avoid time zones. 
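
As a concrete case of that simple subtraction, here is a sketch that counts the days between two dates. It assumes GNU date for -d and %s; the function name days_apart is ours and is unrelated to the Unix Review functions mentioned in Recipe 11.3.

```shell
# days_apart DATE1 DATE2
# Print the whole number of days from DATE1 to DATE2 (both CCYY-MM-DD).
# Working in UTC means every day is exactly 86,400 seconds long.
days_apart() {
    first=$(date --utc -d "$1" '+%s') || return 1
    second=$(date --utc -d "$2" '+%s') || return 1
    echo $(( (second - first) / 86400 ))
}

days_apart '2005-11-05' '2005-12-25'    # 50
days_apart '2004-02-28' '2004-03-01'    # 2 (2004 is a leap year)
```

Note that the leap day needed no special handling: date did that work when it produced the epoch seconds.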


Table 11-1 lists values that may be of use. 


Table 11-1. Conversion table of common epoch time values

Seconds     Minutes  Hours  Days
60          1
300         5
600         10
3,600       60       1
18,000      300      5
36,000      600      10
86,400      1,440    24     1
172,800     2,880    48     2
604,800     10,080   168    7
1,209,600   20,160   336    14
2,592,000   43,200   720    30
31,536,000  525,600  8,760  365


See Also 


• http://www.jpsdomain.org/networking/time.html
• Recipe 11.3, “Automating Date Ranges”
• Recipe 11.4, “Converting Dates and Times to Epoch Seconds”
• Recipe 11.5, “Converting Epoch Seconds to Dates and Times”
• Recipe 13.13, “Isolating Specific Fields in Data”


11.8 Handling Time Zones, Daylight Saving 
Time, and Leap Years 


Problem 


You need to account for time zones, Daylight Saving Time, and leap years or 
seconds. 


Solution 
Don’t. 


Discussion 


This is a lot trickier than it sounds. Leave it to code that’s already been in use 
and debugged for years, and just use a tool that can handle your needs. Odds 
are high that one of the other recipes in this chapter has covered what you 
need, probably using GNU date. If not, there is almost certainly another tool 
out there that can do the job. For example, there are a number of excellent
Perl modules that deal with dates and times.


Really, we aren’t kidding. This is a real nightmare to get right. Save yourself 
a lot of agony and just use a tool. 


See Also 


• Recipe 11.1, “Formatting Dates for Display”
• Recipe 11.3, “Automating Date Ranges”
• Recipe 11.4, “Converting Dates and Times to Epoch Seconds”
• Recipe 11.5, “Converting Epoch Seconds to Dates and Times”
• Recipe 11.7, “Figuring Out Date and Time Arithmetic”


11.9 Using date and cron to Run a Script on the Nth Day


Problem 


You need to run a script on the Nth weekday of the month (e.g., the second 
Wednesday), and most crons will not allow that. 


Solution 


Use a bit of shell code in the command to be run. In your Linux Vixie-cron
crontab, adapt one of the following lines. If you are using another cron
program, you may need to convert any day of the week names to numbers
according to the scheme your cron uses (0-6 or 1-7) and use +%w (day of
week as number) in place of +%a (locale’s abbreviated weekday name):


# Vixie-cron
# Min  Hour DoM  Mnth DoW Program
# 0-59 0-23 1-31 1-12 0-7

# Vixie-cron requires % to be escaped or you get an error!
# DoW is left as * because cron runs a job when either DoM or DoW matches
# if both are restricted; the date test picks out the weekday instead.
# The test uses = rather than == because cron runs commands with /bin/sh.

# Run the first Wednesday @ 23:00
00 23 1-7 * * [ "$(date '+\%a')" = "Wed" ] && /path/to/command args to command

# Run the second Thursday @ 23:00
00 23 8-14 * * [ "$(date '+\%a')" = "Thu" ] && /path/to/command

# Run the third Friday @ 23:00
00 23 15-21 * * [ "$(date '+\%a')" = "Fri" ] && /path/to/command

# Run the fourth Saturday @ 23:00
00 23 22-28 * * [ "$(date '+\%a')" = "Sat" ] && /path/to/command

# Run the fifth Sunday @ 23:00
00 23 29-31 * * [ "$(date '+\%a')" = "Sun" ] && /path/to/command


WARNING 


Note that any given day of the week doesn’t always happen five times during 
one month, so be sure you really know what you are asking for if you schedule 
something for the fifth week of the month. 


Also note that Vixie-cron requires a % to be escaped or you get an error like 
“Syntax error: EOF in backquote substitution.” Other versions of cron may not 
require this, so check your manpage. 


NOTE 


If cron seems like it’s not working, try restarting your MTA (e.g., sendmail). 
Some versions of cron on some systems, such as Vixie-cron on Red Hat, are 
tied into the sendmail process. 


Discussion 


Most versions of cron (including Linux’s Vixie-cron) do not allow you to
schedule a job on the Nth weekday of the month. To get around that, we schedule
the job to run during the range of days when the Nth weekday we need occurs,
then check to see if it is the correct day on which to run. The “second
Wednesday of the month” must occur somewhere in the range of the 8th to
14th day of the month, so we simply run every day in that range and see if
it’s Wednesday. If so, we execute our command.
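
The same reasoning can be turned around: given a date, integer arithmetic on the day of the month tells you which occurrence of its weekday it is. This is a sketch assuming GNU date for -d; the function name week_of_month is ours.

```shell
# week_of_month DATE
# Print which occurrence of its weekday DATE (CCYY-MM-DD) is within its
# month: 1 for days 1-7, 2 for days 8-14, and so on.
week_of_month() {
    dom=$(date -d "$1" '+%d') || return 1
    # Strip a leading zero so that 08 and 09 are not read as invalid octal
    dom=${dom#0}
    echo $(( (dom - 1) / 7 + 1 ))
}

week_of_month '2006-10-11'    # 2 (the second Wednesday of October 2006)
week_of_month '2006-10-28'    # 4 (the fourth Saturday)
week_of_month '2006-10-29'    # 5 (the fifth Sunday)
```

A script could call something like this at startup and exit early on the wrong week, which keeps the crontab entry itself simple.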


Table 11-2 shows the ranges noted in the solution. 


Table 11-2. Day ranges for each week of a month

Week                          Day range
First                         1 to 7
Second                        8 to 14
Third                         15 to 21
Fourth                        22 to 28
Fifth (see previous warning)  29 to 31


We know this almost seems too simplistic; check a calendar if you don’t 
believe us: 


$ cal 10 2006
    October 2006
 S  M Tu  W Th  F  S
 1  2  3  4  5  6  7
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31
$


See Also


• man 5 crontab
• man cal


11.10 Logging with Dates


Problem 


You want to output logs or other lines with dates, but you want to avoid the 
overhead of shelling out to the date command. 


Solution 


As of bash 4.2 or newer, you can use printf '%(fmt)T' for dates and times:


printf '%(%F %T)T; Foo Bar\n' '-1'


You can also use printf to assign to a variable, so you can easily reuse it:


printf -v today '%(%F)T' '-1' # Set $today = '2014-11-15'


Discussion 
The '-1' argument is important, and inconsistent! The bash manpage says: 


Two special argument values may be used: -1 represents the current time, 
and -2 represents the time the shell was invoked. 


But the default behavior changed between bash 4.2 and 4.3. In 4.2, a null 
argument is treated as null, which will return the local time at the Unix epoch, 
which is almost certainly not what you want or expect. In 4.3 there is a 
special exception so that a null argument is treated as a '-1' argument. For 
example: 


$ echo $BASH_VERSION 
4.2.37(1)-release 


$ printf '%(%F %T %Z)T; Foo Bar\n' 
1969-12-31 19:00:00 EST; Foo Bar 


$ printf '%(%F %T %Z)T; Foo Bar\n' '-1'
2014-11-15 15:24:26 EST; Foo Bar


$ echo $BASH_VERSION
4.3.11(1)-release 


$ printf '%(%F %T %Z)T; Foo Bar\n' 
2014-11-15 15:25:02 EST; Foo Bar 


$ printf '%(%F %T %Z)T; Foo Bar\n' '-1' 
2014-11-15 15:25:05 EST; Foo Bar 
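
Given that inconsistency, it may be simplest to wrap the call in a small function that always passes '-1'. This is a sketch, assuming bash 4.2 or newer; the function name log_msg is ours.

```shell
#!/usr/bin/env bash
# log_msg MESSAGE...
# Print MESSAGE prefixed with the current date and time using only the
# printf builtin (bash 4.2 or newer), so no date process is started.
log_msg() {
    # Always pass -1 explicitly; bash 4.2 treats a missing argument as
    # the epoch rather than as the current time
    printf '%(%F %T)T; %s\n' '-1' "$*"
}

log_msg 'Foo Bar'
```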


NOTE 


The printf in bash is a builtin command, but there is also a separate binary 
executable called printf which isn’t the same. The separate executable is for 
other shells that don’t have a builtin printf. So, don’t confuse the manpage for 
printf with the description of printf that is part of the bash manpage. Though 
there are large similarities between the two, the latter is what you want. 


See Also 


• man date
• man strftime
• Recipe 15.15, “Using logger Correctly”
• Recipe 15.17, “Automating a Process Using Phases”
• Recipe 17.18, “Writing to a Circular Log”


Chapter 12. End-User Tasks as Shell Scripts


You have seen a lot of smaller scripts and syntax up to now. Our examples 
have, of necessity, been small in scale and scope. Now we would like to show 
you a few larger (though not large) examples. They are meant to give you 
useful, real-world examples of actual uses of shell scripts beyond just system 
administration tasks. We hope you find them useful or usable. More than that, 
we hope you learn something about bash by reading through them and maybe 
trying them yourself or even tweaking them for your own use. 


12.1 Starting Simple by Printing Dashes 


Problem 


You want a simple script that prints a line of dashes. 


Solution 


Printing a line of dashes with a simple command might sound easy—and it is. 
But as soon as you think you’ve got a simple script, it begins to grow. What 
about varying the length of the line of dashes? What about changing the 
character from a dash to a user-supplied character? Do you see how easily 
feature creep occurs? Can we write a simple script that takes those extensions 
into account without getting too complex? 


Consider the script in Example 12-1. 


Example 12-1. ch12/dash 


#!/usr/bin/env bash
# cookbook filename: dash
# dash - print a line of dashes
# options: # how many (default 72)
#       -c X use char X instead of dashes
#
function usagexit ( )
{
    printf "usage: %s [-c X] [#]\n" ${0##*/}    # (1)
    exit 2
} >&2

LEN=72                                          # (2)
CHAR='-'
while (( $# > 0 ))                              # (3)
do
    case $1 in
    [0-9]*) LEN=$1;;                            # (4)
    -c) shift
        CHAR=$1;;                               # (5)
    *) usagexit;;                               # (6)
    esac
    shift
done

if (( LEN > 4096 ))                             # (7)
then
    echo "too large" >&2
    exit 3
fi

# build the string to the exact length
DASHES=""
for ((i=0; i<LEN; i++))
do
    DASHES="${DASHES}${CHAR}"
done
printf "%s\n" "$DASHES"


Discussion 


The basic task is accomplished by building a string of the required number of
dashes (or an alternate character) and then printing that string to standard
output (STDOUT). That takes only the last six lines. The default values are
set early in the script, before the while loop. All the other lines deal with
argument parsing, error checking, user messages, and comments.


You will find that this is pretty typical for a robust end-user script. Less than 
20 percent of the code does most of the “real” work—but that other 80 
percent of the code is what makes the script usable and “friendly” for your 
users. 

(1) Here we use the string manipulation operator with a pattern (*/) to trim
    off any leading pathname characters when displaying this script’s name.
    That way no matter how the user invokes the script (for example,
    ./dash, /home/username/bin/dash, or even ../../over/there/dash), it
    will still be referred to as just dash in the usage message.

(2) The default values are set with the two assignments here.

(3) The argument parsing is done while there are some arguments to parse.
    As arguments are handled, each shift builtin will decrement the number
    of arguments and eventually get us out of the while loop.

(4) There are only two possible allowable arguments: specifying a number
    for the length, and

(5) a -c option followed by a character, to be used instead of the dash.

(6) Any other options will end up here and result in the usage message and an
    early exit.

(7) Finally, notice that the script enforces a maximum length here, though it
    is completely arbitrary. Would you keep or remove such a restriction?


We could be more careful in parsing the -c and its argument. Because we 
don’t use more sophisticated parsing (e.g., with getopts; see Recipe 13.1), our 
code requires the option and its argument to be separated by whitespace. (In 
running the script one must type, for example, -c 25 and not -c25.) We 
don’t even check to see that the second argument is supplied at all. 
Furthermore, the user might type not just a single letter but a whole string. 
(Can you think of a simple way to limit this, by taking only the first character 
of the argument? Do you need/want to? Why not let the user specify a string 
instead of a single character?) 


The parsing of the numerical argument could also use some more
sophisticated techniques. The patterns in a case statement follow the rules of
pathname expansion and are not regular expressions. It might be tempting to
assume that the case pattern [0-9]* means “only digits,” but that would be
the regular expression meaning. In the case statement it means any string
that begins with a digit. Not catching erroneous input like 9.5 or 612more
will result in errors in the script later on. The use of an if statement with =~
and its more sophisticated regular expression matching might be useful here.

You can see from this example that even simple scripts can become quite
involved, mostly due to error checking, argument parsing, and the like. For
scripts that you write for yourself, such techniques are often glossed over or
skipped entirely—after all, as the only user of the script you know the proper
usage and are willing to use it correctly or have it fail in an ugly display of
error messages. For scripts that you want to share, however, such is not the
case, and much care and effort will likely be put into toughening up your
script.


See Also 


• Recipe 5.8, “Looping Over Arguments Passed to a Script” 

• Recipe 5.11, “Counting Arguments” 

• Recipe 5.12, “Consuming Arguments” 

• Recipe 5.20, “Using bash for basename” 

• Recipe 6.15, “Parsing Command-Line Arguments” 

• Recipe 13.1, “Parsing Arguments for Your Shell Script” 


12.2 Viewing Photos in an Album 


Problem 


You have a directory full of images you just downloaded from your digital 
camera. You want a quick and easy way to view them all, so that you can 
pick out the good ones. 


Solution 


Write a shell script that will generate a set of HTML pages so that you can 
view your photos with a browser. Call it mkalbum and put it somewhere like 
your ~/bin directory. 


On the command line, cd into the directory where you want your album 
created (typically where your photos are located). Then run some command 
that will generate the list of photos that you want included in this album (e.g., 
ls *.jpg, but see also Recipe 9.5), and pipe this output into the mkalbum 
shell script in Example 12-2, which we will explain later. You need to put the 
name of the album (i.e., the name of a directory that will be created by the 
script) on the command line as the only argument to the shell script. It might 
look something like this: 


ls *.jpg | mkalbum rugbymatch 


Figure 12-1 shows a sample of the generated web page. 
[Figure: browser window titled “./RoughAction1.jpg - Mozilla Firefox”, showing the heading ./RoughAction1.jpg above the links First, Prev, Next, and Last] 


Figure 12-1. Sample mkalbum web page 


The large title is the name of the photo (i.e., the filename); there are 
hyperlinks to other pages for the first, last, next, and previous photos in the 
album. 


Example 12-2 is the shell script (mkalbum) that will generate the set of 
HTML pages for your album, one page per image (the line numbers are not 
part of the script, but are included to make it easier to discuss). 


Example 12-2. ch12/mkalbum 


#!/usr/bin/env bash
# cookbook filename: mkalbum                                  # (1)
# mkalbum - make an HTML "album" of a pile of photo files.
# ver. 0.2
#
# An album is a directory of HTML pages.
# It will be created in the current directory.
#
# An album page is the HTML to display one photo, with
# a title that is the filename of the photo, along with
# hyperlinks to the first, previous, next, and last photos.
#
# ERROUT
ERROUT()                                                      # (2)
{
    # hand the format string and its arguments straight to printf
    printf "$@"
} >&2                                                         # (3)

#
# USAGE
USAGE()                                                       # (4)
{
    ERROUT "usage: %s <newdir>\n" ${0##*/}                    # (5)
}

# EMIT(thisph, startph, prevph, nextph, lastph)
EMIT()                                                        # (6)
{
    THISPH="../$1"
    STRTPH="${2%.*}.html"
    PREVPH="${3%.*}.html"
    NEXTPH="${4%.*}.html"
    LASTPH="${5%.*}.html"
    if [ -z "$3" ]
    then
        PREVLINE='<TD> Prev </TD>'
    else
        PREVLINE='<TD> <A HREF="'$PREVPH'"> Prev </A> </TD>'
    fi
    if [ -z "$4" ]
    then
        NEXTLINE='<TD> Next </TD>'
    else
        NEXTLINE='<TD> <A HREF="'$NEXTPH'"> Next </A> </TD>'
    fi
cat <<EOF                                                     # (7)
<HTML>
<HEAD><TITLE>$THISPH</TITLE></HEAD>
<BODY>
  <H2>$THISPH</H2>
<TABLE WIDTH="25%">
  <TR>
  <TD> <A HREF="$STRTPH"> First </A> </TD>
  $PREVLINE
  $NEXTLINE
  <TD> <A HREF="$LASTPH"> Last </A> </TD>
  </TR>
</TABLE>
  <IMG SRC="$THISPH" alt="$THISPH"
   BORDER="1" VSPACE="4" HSPACE="4"
   WIDTH="800" HEIGHT="600" />
</BODY>
</HTML>
EOF
}

if (( $# != 1 ))
then
    USAGE
    exit -1
fi
ALBUM="$1"
if [ -d "${ALBUM}" ]
then
    ERROUT "Directory [%s] already exists.\n" "${ALBUM}"
    USAGE
    exit -2
else
    mkdir "$ALBUM"
fi
cd "$ALBUM"

PREV=""
FIRST=""
LAST="last"

while read PHOTO
do
    # prime the pump
    if [ -z "${CURRENT}" ]
    then
        CURRENT="$PHOTO"
        FIRST="$PHOTO"
        continue
    fi

    PHILE=${CURRENT##*/}    # remove any leading path
    EMIT "$CURRENT" "$FIRST" "$PREV" "$PHOTO" "$LAST" > "${PHILE%.*}.html"

    # set up for next iteration
    PREV="$CURRENT"
    CURRENT="$PHOTO"

done

PHILE=${CURRENT##*/}    # remove any leading pathname
EMIT "$CURRENT" "$FIRST" "$PREV" "" "$LAST" > "${PHILE%.*}.html"

# make the symlink for "last"
ln -s "${PHILE%.*}.html" ./last.html                          # (8)

# make a link for index.html
ln -s "${FIRST%.*}.html" ./index.html


Discussion 


While there are plenty of free or inexpensive photo viewers, using bash to 
build a simple photo album helps to illustrate the power of shell 
programming, and gives us a meatier example to discuss. 

(1) The shell script begins with the special comment that defines which 
executable to use to run this script. Then follow some comments 
describing the script. Let’s just put in one more word encouraging you to 
be sure to comment your scripts. Even the sparsest comments will be 
worth something 3 days or 13 months from now when you wish you 
could remember what this script was all about. 

(2) After the comments we have put our function definitions. The ERROUT 
function will act very much like printf (since all it does is invoke printf), 
but with the added twist that it redirects its output to standard error. This 
saves you from having to remember to redirect the output on every printf 
of error messages. 

(3) While normally we put the redirection at the end of a command, here it is 
put at the end of a function definition to tell bash to redirect all output 
that emanates from this function. 

(4) The USAGE function, while not strictly necessary as a separate function, is 
a handy way to document up front how you expect your script to be 
invoked. Rather than hardcoding the name of the script in our usage 
message, we like to use the $0 special variable in case the script is 
renamed. The $0 is the name of the script as it was invoked, including 
any pathname if specified by the user. 

(5) By using the ## operator we get rid of all that path noise (specified by the 
*/). 

(6) The EMIT function is a larger function. Its purpose is to emit the HTML 
for each page of the album. Each page is its own (static) web page, with 
hyperlinks to the previous and next image as well as links to the first and 
last image. The EMIT function doesn’t know much; it is given the names 
of all the images to which to link, and it takes those names and converts 
them to page names, which for our script are the same as the image name 
but with the file extension changed to .html. So, for example, if $2 held 
the filename pict001.jpg, the result of ${2%.*}.html would be 
pict001.html. 

(7) Since there is so much HTML to emit, rather than have printf after 
printf statement, we use the cat command and a here-document to allow 
us to type the literal HTML in the script, line after line, with shell variable 
expansion being applied to the lines. The cat command is simply copying 
(concatenating) STDIN to the STDOUT. In our script we redirect STDIN 
to take its input from the succeeding lines of text; i.e., a here-document. 
By not quoting the end-of-input word (just EOF and not 'EOF' or \EOF) 
we ensure that bash will continue to do variable substitution on our input 
lines, enabling us to use variable names based on our parameters for 
various titles and hyperlinks. 

(8) The last two commands in the script create symbolic links as shortcuts to 
the first and last photos. This way the script doesn’t need to figure out the 
names of the first and last pages of the album; it just uses the hardcoded 
names index.html and last.html, respectively, when generating all the 
other album pages. Then, as a last step, since the last filename processed 
is the last photo in our album, it creates the link to it. Similarly, with the 
first page (although we know that name right away), we waited until the 
end to put it with the other symbolic link, just as a matter of style, to 
keep the two similar operations in proximity. 
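
The pathname trimming and the here-document behavior described above can be tried out in isolation. What follows is a minimal sketch, not a piece of mkalbum itself; page_name, emit_title, and emit_literal are hypothetical helpers invented for the demonstration.

```bash
#!/usr/bin/env bash
# Illustration only; these helper names are hypothetical.

# Strip any leading directories (##*/), then swap the extension (%.*).
page_name() {
    local file=${1##*/}
    printf '%s\n' "${file%.*}.html"
}

# Unquoted delimiter: $1 is expanded inside the here-document.
emit_title() {
    cat <<EOF
<TITLE>$1</TITLE>
EOF
}

# Quoted delimiter: the text is copied literally, dollar sign and all.
emit_literal() {
    cat <<'EOF'
<TITLE>$1</TITLE>
EOF
}

page_name ./photos/pict001.jpg
emit_title pict001.jpg
emit_literal pict001.jpg
```

Comparing the output of the last two calls shows why the script leaves its EOF unquoted: only the unquoted form substitutes the photo name into the HTML.
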


One last thought about the design of this script: we could have passed in a 
filename to the EMIT function and had EMIT redirect its own output to that 
file, but such redirection was not really logically a part of the EMIT idea (cf. 
our ERROUT function, whose whole purpose is the redirection). The purpose of 
EMIT is to create the HTML; where we send that HTML is another matter. 
Because bash allows us to redirect output so easily, it is possible to make that 
a separate step. Besides, it was easier to debug when the function just wrote its 
output to STDOUT. 
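
The two design choices can be put side by side in a small sketch. The functions warn and emit_page are hypothetical stand-ins for ERROUT and EMIT, and the temporary directory is created only for this demonstration.

```bash
#!/usr/bin/env bash
# Illustration only; warn and emit_page are hypothetical stand-ins
# for ERROUT and EMIT.

# The redirection is part of the definition, so every call writes
# to standard error.
warn() {
    printf "$@"
} >&2

# No redirection here: the function only produces text.
emit_page() {
    printf '<H2>%s</H2>\n' "$1"
}

outdir=$(mktemp -d) || exit 1

# The caller decides where the HTML goes.
emit_page pict001.jpg > "$outdir/pict001.html"
warn 'wrote %s\n' "$outdir/pict001.html"

cat "$outdir/pict001.html"
rm -rf "$outdir"
```

Because emit_page writes to standard output, it can be run by hand and its output inspected on the terminal, which is the debugging convenience mentioned above.
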


See Also 


• http://www.w3schools.com 

• HTML & XHTML: The Definitive Guide, 6th Edition, by Chuck Musciano 
and Bill Kennedy (O’Reilly) 

• Recipe 3.2, “Keeping Your Data with Your Script” 

• Recipe 3.3, “Preventing Weird Behavior in a Here-Document” 

• Recipe 3.4, “Indenting Here-Documents” 

• Recipe 5.13, “Getting Default Values” 

• Recipe 5.14, “Setting Default Values” 

• Recipe 5.18, “Changing Pieces of a String” 

• Recipe 5.23, “Using Array Variables” 

• Recipe 9.5, “Finding Files Irrespective of Case” 

• Recipe 16.11, “Keeping a Private Stash of Utilities by Adding ~/bin” 


12.3 Loading Your MP3 Player 


Problem 


You have a collection of MP3 files that you would like to put on your MP3 
player, but you have more music than can fit in its memory. How can you 
load your player with music without having to babysit it by dragging and 
dropping files until it is full? 


Solution 


Use a shell script like the one in Example 12-3 to keep track of the available 
space as it copies files onto the MP3 player, quitting when it is full. 


Example 12-3. ch12/load_mp3 


#!/usr/bin/env bash
# cookbook filename: load_mp3
# Fill up my mp3 player with as many songs as will fit.
# N.B.: This assumes that the mp3 player is mounted on /media/mp3
#

#
# determine the size of a file
#
function FILESIZE ()
{
    FN=${1:-/dev/null}
    if [[ -e $FN ]]
    then
        # FZ=$(stat -c '%b' "$FN")
        set -- $(ls -s "$FN")
        FZ=$1
    fi
}

#
# compute the free space on the mp3 player
#
function FREESPACE
{
    # FREE=$(df /media/mp3 | awk '/^\/dev/ {print $4}')
    set -- $(df /media/mp3 | grep '^/dev/')
    FREE=$4
}

# subtract the (given) file size from the (global) free space
function REDUCE ()
(( FREE-=${1:-0}))    # this works, but is unusual

#
# main:
#
let SUM=0                                    # (1)
let COUNT=0
export FZ
export FREE
FREESPACE                                    # (2)
find . -name '*.mp3' -print |                # (3)
( while read PATHNM                          # (4)
do
    FILESIZE "$PATHNM"
    if ((FZ <= FREE))
    then
        echo loading $PATHNM
        cp "$PATHNM" /media/mp3
        if (( $? == 0 ))
        then
            let SUM+=FZ
            let COUNT++
            REDUCE $FZ
        else
            echo "bad copy of $PATHNM to /media/mp3"
            rm -f /media/mp3/"${PATHNM##*/}"
            # recompute because we don't know how far it got
            FREESPACE
        fi
        # any reason to go on?
        if (( FREE <= 0 ))
        then
            break
        fi
    else
        echo skipping $PATHNM
    fi
done
printf "Loaded %d songs (%d blocks)" $COUNT $SUM
printf " onto /media/mp3 (%d blocks free)\n" $FREE
)
# end of script


Discussion 


Invoke this script and it will copy any MP3 file that it finds from the current 
directory on down (toward the leaf nodes of the tree) onto an MP3 player (or 
other device) mounted on /media/mp3. The script will try to determine the 
free space on the device before it begins its copying, and then it will subtract 
the disk size of copied items so as to know when to quit (i.e., when the device 
is full, or as full as we can get it). 


The script is simple to invoke: 
load_mp3 


Then you can watch as it copies files, or you can go grab a cup of coffee—it 
depends on how fast your disk is and how fast your MP3 memory writes go. 
Let’s look at some bash features used in this script: 

(1) We’ll start after the opening comments and the function definitions. 
(We’ll discuss the function definitions later.) The main body of the shell 
script starts by initializing some variables and exporting some variables 
so they will be available globally. 

(2) Here we call the FREESPACE function to determine how much free space 
is available on the MP3 player before we begin copying files. 

(3) The find command will locate all the MP3 files (actually, only those files 
whose names end in “.mp3”). This information is piped into a while loop 
that begins on the next line. 

(4) Why is the while loop wrapped inside of parentheses? The parentheses 
mean that the statements inside them will be run inside of a subshell. But 
what we’re concerned about here is that we group the while statement 
with the printf statements that come after the loop, near the very end of 
the script. Since each statement in a pipeline is run in its own subshell, 
and since the find pipes its output into the while loop, none of the 
counting that we do inside the while loop will be available outside of that 
loop. Putting the while and the printfs inside of a subshell means they 
are now both executing in the same shell environment and can share 
variables. A similar effect can be accomplished with braces. 


NOTE 


As of bash 4.4 the parentheses are no longer needed, provided that this is run as 
a shell script (not interactively) and the shell option lastpipe is set, as would 
happen if you put shopt -s lastpipe in the script somewhere before the find 
command. 
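
Here is a minimal sketch of that option at work, assuming a bash new enough to support lastpipe and assuming the code runs as a script. The count_lines function and the three filenames are hypothetical, chosen only to give the loop something to count.

```bash
#!/usr/bin/env bash
# Illustration only. lastpipe takes effect when job control is off,
# which is the normal state of affairs in a script.
shopt -s lastpipe

count_lines() {
    local count=0
    printf '%s\n' one.mp3 two.mp3 three.mp3 |
    while read -r name
    do
        (( count += 1 ))
    done
    # With lastpipe set, the loop ran in this shell, so count survives.
    echo "$count"
}

count_lines
```

Without the shopt line the while loop would run in a subshell, and the function would report the initial value of count instead of the number of lines read.
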


Let’s look inside the while loop and see what it’s doing: 


FILESIZE "$PATHNM" 
if ((FZ <= FREE)) 
then 
    echo loading $PATHNM 
    cp "$PATHNM" /media/mp3 
    if (( $? == 0 )) 
    then 


For each filename that it reads from the find command’s output, it will use 
the FILESIZE function (discussed momentarily) to determine the size of that 
file. Then it checks to see if the file is smaller than the remaining disk space; 
i.e., whether there is room for this file. If so, it will echo the filename so we 
can see what it’s doing and then it will invoke cp to copy the file onto the 
MP3 player. 


It’s important to check and see if the copy command completed successfully. 
The $? is the result of the previous command, so it represents the result of the 
cp command. If the copy is successful, then we can deduct the copied file’s 
size from the space available on the MP3 player. But if it failed, then we need 
to try to remove the copy (since, if it is there at all, it will be incomplete). We 
use the -f option on rm so as to avoid error messages if the file never got 
created. Then we recalculate the free space to be sure that we have the count 
right. (After all, the copy might have failed because somehow our estimate 
was wrong and we really are out of space.) 


In the main part of the script, all three of our if statements use the double 
parentheses around the expression. All three are numerical if statements, and 
we wanted to use the familiar operators (e.g., <= and ==). These same if 
conditions could have been checked using the square bracket ([) form of the 
if statement, but then the operators would be -le and -eq. We do use a 
different form of the if statement in the FILESIZE function. There we need 
to check the existence of the file (whose name is in the variable $FN). That is 
simple to write with the -e operator, but that is not available to the 
arithmetic-style if statement (i.e., when using parentheses instead of square 
brackets). 
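
The three forms of test can be compared in a short sketch. The helper names fits, fits_classic, and exists are hypothetical, and the block counts are made-up numbers.

```bash
#!/usr/bin/env bash
# Illustration only; these helper names are hypothetical.

# Arithmetic form: variable names need no $, and <= reads naturally.
fits() {
    local size=$1 free=$2
    (( size <= free ))
}

# The same comparison with the square-bracket form and -le.
fits_classic() {
    [ "$1" -le "$2" ]
}

# File tests such as -e are only available in the bracket forms.
exists() {
    [[ -e $1 ]]
}

fits 40 100          && echo "40 blocks fit in 100"
fits_classic 400 100 || echo "400 blocks do not fit in 100"
exists /dev/null     && echo "/dev/null exists"
```
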


Speaking of arithmetic expressions, let’s take a look at the REDUCE function 
and see what’s going on there: 


function REDUCE () 
(( FREE-=${1:-0}))    # this works, but is unusual 


Most people write functions using curly braces to delimit the body of the 
function. However, in bash, any compound statement will work. In this case 
we chose the double parentheses of arithmetic evaluation, since that is all we 
need the function to do, but this is unusual and could cause readability and 
maintainability confusion unless well commented. Whatever value is 
supplied on the command line that invokes REDUCE will be the first 
(positional) parameter (i.e., $1). We simply subtract that value from $FREE to 
get the new value for $FREE. That is why we used the arithmetic expression 
syntax: so that we can use the -= operator. 
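
To see that both styles of function body behave the same way, here is a small sketch. The names reduce_braces and reduce_arith and the starting value of FREE are hypothetical.

```bash
#!/usr/bin/env bash
# Illustration only; FREE and the sizes are made-up numbers.

FREE=1000

# Brace-delimited body, the usual style.
reduce_braces() {
    (( FREE -= ${1:-0} ))
}

# The body is the arithmetic command itself, as in REDUCE.
function reduce_arith ()
(( FREE -= ${1:-0} ))

reduce_braces 300
reduce_arith 200
reduce_arith          # no argument: subtracts the default, 0
echo "FREE is now $FREE"
```

One thing to keep in mind with either style: the function's exit status is that of the arithmetic command, which is nonzero whenever the expression evaluates to zero, so a caller should not treat that status as a sign of failure.
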


While we are looking at the functions, let’s look at two lines in the FILESIZE 
function. The comment in the script shows another simple way to do this, but 
we want to explain a more general technique useful for more interesting 
purposes than just checking file sizes. Take a close look at these lines: 


set -- $(ls -s "$FN") 
FZ=$1 


There is a lot going on in those few characters. First, the ls command is run 
inside of a subshell (the $() construct). The -s option on ls gives us the size, 
in blocks, of the file along with the filename. The output of the command is 
returned as words on the command line for the set command. The purpose of 
the set command here is to parse the words of the ls output. There are lots of 
ways we could do that, but this approach is a useful technique to remember. 


The set -- will take the remaining words on the command line and make 
them the new positional parameters. If you write set -- this is a test, 
then $1 is this and $3 is a. The previous values for $1, $2, etc. are lost, but 
in our script we saved into $FN the only parameter that gets passed into this 
function. Having done so, we are free to reuse the positional parameters, and 
we use them by having the shell do the parsing for us. We can then get at the 
file size as $1, as you see in the assignment to $FZ. (By the way, in this case, 
since this is inside a function, it is only the function’s positional parameters 
that are changed, not those from the invoking of the script.) 
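
The behavior of set -- is easy to try out on its own. In this sketch the first_word function and the canned line of text (standing in for real ls -s output) are hypothetical.

```bash
#!/usr/bin/env bash
# Illustration only; first_word and the sample text are hypothetical.

# Split a line of output into words and return the first one.
first_word() {
    local line=$1       # save the argument before set -- replaces it
    set -- $line        # unquoted on purpose: let the shell split it
    echo "$1"
}

set -- this is a test
echo "\$1=$1 \$3=$3 count=$#"

# ls -s prints "<blocks> <name>"; a canned line stands in for it here.
first_word "48 song.mp3"
echo "parameters outside the function are untouched: $*"
```

The final echo confirms the point made above: the set -- inside the function replaces only that function's positional parameters.
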


We use this technique of having the shell do our parsing for us again, in the 
other function: 


set -- $(df /media/mp3 | grep '^/dev/') 
FREE=$4 


The output of the df command will report on the size, in blocks, available on 
the device. We pipe the output through grep, since we only want the one line 
with our device’s information and we don’t want the heading line that df 
produces. Once bash has set our arguments, we can grab the free space on the 
device as $4. 


A comment in the script shows an alternative way to parse the output of the 
df command. We could just pipe the output into awk and let it parse the 
output from df for us: 


# FREE=$(df /media/mp3 | awk '/^\/dev/ {print $4}') 


In this version, by using the expression in slashes we tell awk to pay attention 
only to lines with a leading /dev. (The caret anchors the search to the 
beginning of the line and the backslash escapes the meaning of the slash, so 
as not to end the search expression at that point and to include a slash as the 
first character to find.) 


So which approach to use? They both involve invoking an external program, 
in one case grep and in the other awk. There are usually several ways to 
accomplish the same thing (in bash as in life), so the choice is yours. In our 
experience, it usually comes down to which one you think of first. 
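
Both approaches can be compared without an MP3 player attached. In this sketch sample_df is a hypothetical stand-in that prints what df might report; the device name and the numbers are made up.

```bash
#!/usr/bin/env bash
# Illustration only. sample_df is a hypothetical stand-in for df;
# the device name and numbers are invented.
sample_df() {
    cat <<'EOF'
Filesystem     1K-blocks   Used Available Use% Mounted on
/dev/sdb1         994000 250000    744000  26% /media/mp3
EOF
}

# Approach 1: grep picks the line, set -- splits it into words.
free_with_set() {
    set -- $(sample_df | grep '^/dev/')
    echo "$4"
}

# Approach 2: awk selects the line and prints the fourth field.
free_with_awk() {
    sample_df | awk '/^\/dev/ {print $4}'
}

free_with_set
free_with_awk
```

Both functions print the same fourth field, which is the "Available" column in the sample output.
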


See Also 


• man df 

• man grep 

• man awk 

• Recipe 10.4, “Defining Functions” 

• Recipe 10.5, “Using Functions: Parameters and Return Values” 

• Recipe 19.8, “Forgetting that Pipelines Make Subshells” 


12.4 Burning a CD 


Problem 


You have a directory full of files on your Linux system that you would like to 
burn to a CD. Do you need an expensive CD burning program, or can you do 
it with the shell and some open source programs? 


Solution 


You can do it with two open source programs called mkisofs and cdrecord, 
and a bash script to help you keep all the options straight. 


Start by putting all the files that you want to copy to CD into a directory 
structure. The script in Example 12-4 will take that directory, make an ISO 
filesystem image from those files, then burn the ISO image. All it takes is a 
bunch of disk space and a bit of time, but you can get up and wander while 
the bash script runs. 


WARNING 


This script may not work on your system. We include it here as an example of 
shell scripting, not as a workable CD recording and backup mechanism. 


Example 12-4. ch12/cdscript 


#!/usr/bin/env bash
# cookbook filename: cdscript
# cdscript - prep and burn a CD from a dir.
#
# usage: cdscript dir [ cddev ]
#
if (( $# < 1 || $# > 2 ))
then
    echo 'usage: cdscript dir [ cddev ]'
    exit 2
fi

# set the defaults
SRCDIR=$1
# your device might be "ATAPI:0,0,0" or other digits
CDDEV=${2:-"ATAPI:0,0,0"}
ISOIMAGE=/tmp/cd$$.iso                       # (1)

echo "building ISO image..."
#
# make the ISO fs image
#
mkisofs -A "$(cat ~/.cdAnnotation)" \
  -p "$(hostname)" -V "${SRCDIR##*/}" \
  -r -o "$ISOIMAGE" "$SRCDIR"
STATUS=$?                                    # (2)
if (( STATUS != 0 ))
then
    echo "Error. ISO image failed."
    echo "Investigate then remove $ISOIMAGE"
    exit $STATUS
fi

echo "ISO image built; burning to cd..."
#
# burn the CD
#
SPD=8
OPTS="-eject -v fs=64M driveropts=burnproof"
cdrecord $OPTS -speed=$SPD dev=${CDDEV} $ISOIMAGE
STATUS=$?                                    # (3)
if (( STATUS != 0 ))
then
    echo "Error. CD Burn failed."
    echo "Investigate then remove $ISOIMAGE"
    exit $STATUS
fi

rm -f $ISOIMAGE
echo "Done."


Discussion 


Here is a quick look at some of the odder constructs in this script: 

(1) We construct a temporary filename by using the $$ variable, which gives 
us our process number. As long as this script is running, it will be the one 
and only process of that number, so this gives us a name that is unique 
among all other running processes. (See Recipe 14.11 for a better way.) 

(2) We save the status of the mkisofs command. Well-written Unix and Linux 
commands (and bash shell scripts) will return 0 on success (i.e., if 
nothing went wrong) and a nonzero value if they fail. We could have just 
used the $? in the if statement on the next line, but we want to hold on to 
the status from the mkisofs command so that, in the event of failure, we 
can pass that value back out as the return value of this script. 

(3) We do the same with the cdrecord command, saving its return value so 
that, if the command fails, the condition in the if statement is true and 
the exit statement can send back that failure code. 
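
The reason for saving $? right away can be shown in a few lines. The run_step wrapper below is hypothetical; it is a sketch of the save-and-return pattern, not part of cdscript.

```bash
#!/usr/bin/env bash
# Illustration only; run_step is a hypothetical wrapper.

# Run a command, report a failure, and hand back the command's own
# exit status rather than the status of the echo that follows it.
run_step() {
    "$@"
    local status=$?     # capture immediately; the next command resets $?
    if (( status != 0 ))
    then
        echo "Error. '$1' failed with status $status." >&2
    fi
    return $status
}

run_step true           && echo "true returned $?"
run_step sh -c 'exit 3' || echo "sh returned $?"
```

Had the function ended with the echo instead of return $status, the caller would have seen the status of echo, and the original failure code would have been lost.
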


It may take a bit of thought to unpack these lines: 


mkisofs -A "$(cat ~/.cdAnnotation)" \ 
  -p "$(hostname)" -V "${SRCDIR##*/}" \ 
  -r -o "$ISOIMAGE" $SRCDIR 


All three lines are just a single line of input to bash that has been separated 
across lines by putting a backslash as the very last character on the line in 
order to escape the normal meaning of an end of line. Be sure you don’t put a 
space after the trailing \. But that’s just the tip of the iceberg here. There are 
three subshells that are invoked whose output is used in the construction of 
the final command line that invokes mkisofs. 


First there is an invocation of the cat program to dump the contents of a file 
called .cdAnnotation located in the home directory (~/) of the user invoking 
this script. The purpose is to provide a string to the -A option, which the 
mkisofs manpage describes as “a text string that will be written into the 
volume header.” Similarly, the -p option wants another such string, this time 
indicating the preparer of the image. For our script it seemed like it might be 
handy to put the hostname where the script is run as the preparer, so we run 
hostname in a subshell (though using the builtin $HOSTNAME variable is more 
efficient). Finally, the volume name is specified with the -V parameter, and 
for that we use the name of the directory where all the files are found. That 
directory is specified on the command line to our script, and we use the ## 
operator to peel off the leading directory pathname (using the pattern */), if 
any (so, for example, /usr/local/stuff becomes just stuff). 
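
The way the final command line is assembled can be observed without burning anything. In this sketch show_args is a hypothetical stand-in for mkisofs that prints each argument it receives, and the annotation text comes from echo rather than from a file.

```bash
#!/usr/bin/env bash
# Illustration only; show_args is a hypothetical stand-in for mkisofs
# that prints each argument it receives on its own line.
show_args() {
    printf '[%s]\n' "$@"
}

SRCDIR=/usr/local/stuff

# One command spread over three lines; each $( ) runs first and its
# output becomes part of the argument list.
show_args -p "${HOSTNAME:-$(hostname)}" \
    -V "${SRCDIR##*/}" \
    -A "$(echo 'text for the volume header')"
```

The double quotes around each substitution keep multiword output, such as the annotation text, together as a single argument.
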


See Also 


• Recipe 5.20, “Using bash for basename” 

• Recipe 14.11, “Using Secure Temporary Files” 


12.5 Comparing Two Documents 


Problem 


It is easy to compare two text files (see Recipe 17.10). But what about 
documents produced by your suite of office applications? They are not stored 
as text, so how can you compare them? If you have two versions of the same 
document, and you need to know what the content changes are (if any) 
between the two versions, is there anything you can do besides printing them 
out and comparing page after page? 


Solution 


First, use an office suite such as LibreOffice that will let you save your 
documents in OpenDocument Format (ODF). Once you have your files in 
ODF, you can use a shell script to compare just the content of the files. We 
stress the word content here because the formatting differences are another 
issue, and it is (usually) the content that is the most important determinant of 
which version is newer or more important to the end user. 


Example 12-5 is a bash script that can be used to compare two LibreOffice 
files, which are saved in ODF (but use the conventional suffix .odt to indicate 
a text-oriented document, as opposed to a spreadsheet or a presentation file). 


Example 12-5. ch12/oodiff 


#!/usr/bin/env bash
# cookbook filename: oodiff
# oodiff -- diff the CONTENTS of two OpenOffice/LibreOffice files
# works only on .odt files
#
function usagexit ()
{
    echo "usage: ${0##*/} file1 file2"
    echo "where both files must be .odt files"
    exit $1
} >&2                                        # (1)

# assure two readable arg filenames which end in .odt
if (( $# != 2 ))
then
    usagexit 1
fi
if [[ $1 != *.odt || $2 != *.odt ]]
then
    usagexit 2
fi
if [[ ! -r $1 || ! -r $2 ]]
then
    usagexit 3
fi

BAS1=$(basename "$1" .odt)
BAS2=$(basename "$2" .odt)

# unzip them someplace private
PRIV1="/tmp/${BAS1}.$$_1"
PRIV2="/tmp/${BAS2}.$$_2"

# make absolute
HERE=$PWD
if [[ ${1:0:1} == '/' ]]                     # (2)
then
    FULL1="${1}"
else
    FULL1="${HERE}/${1}"
fi

# make absolute
if [[ ${2:0:1} == '/' ]]
then
    FULL2="${2}"
else
    FULL2="${HERE}/${2}"
fi

# mkdir scratch areas and check for failure
# N.B. must have whitespace around the { and } and
#      must have the trailing ; in the {} lists
mkdir "$PRIV1" || { echo "Unable to mkdir '$PRIV1'" ; exit 4; }
mkdir "$PRIV2" || { echo "Unable to mkdir '$PRIV2'" ; exit 5; }

cd "$PRIV1"
unzip -q "$FULL1"
# (3)
sed -e 's/>/>\
/g' -e 's/</\
</g' content.xml > contentwnl.xml

cd "$PRIV2"
unzip -q "$FULL2"
sed -e 's/>/>\
/g' -e 's/</\
</g' content.xml > contentwnl.xml

cd "$HERE"
diff "${PRIV1}/contentwnl.xml" "${PRIV2}/contentwnl.xml"

rm -rf "$PRIV1" "$PRIV2"


Discussion 


Underlying this script is the knowledge that LibreOffice files are stored like 
ZIP files. Unzip them and there is a collection of XML files that define your 
document. One of those files contains the content of your document; that is, 
the paragraphs of text without any formatting (but with XML tags to tie each 
snippet of text to its formatting). The basic idea behind the script is to unzip 
the two documents and compare the content pieces using diff, and then clean 
up the mess that we’ve made. 


One other step is taken to make the diffs easier to read. Since the content is 
all in XML and there aren’t a lot of newlines, the script will insert a newline 
after every opening tag and before every end tag (tags whose contents begin 
with a slash, as in </ ... >). While this introduces a lot of blank lines, it 
also enables diff to focus on the real differences: the textual content. 


As far as shell syntax goes, you have seen all this in other recipes in the book, 
but it may be worth explaining a few pieces of syntax just to be sure you can 
tell what is going on in the script. 

(1) This line redirects all the output from this shell function to STDERR. 
That seems appropriate since this is a help message, not the normal output 
of this program. Putting the redirect on the function definition means we 
don’t need to remember to redirect every output line separately. 

(2) This contains the terse expression if [[ ${1:0:1} == '/' ]], which 
checks to see whether the first argument begins with a slash character. 
The ${1:0:1} is the syntax for a substring of a shell variable. The 
variable is ${1}, the first positional parameter. The :0:1 syntax says to 
start at an offset of zero and that the substring should be one character 
long. 

(3) The lines of this sed command may be a little hard to read because they 
involve escaping the newline character so that it becomes part of the sed 
substitution string. The substitution expression takes each > in the first 
substitution and each < in the second, and replaces it with itself plus a 
newline. We do this to our content file in order to spread out the XML 
and get the content on lines by itself. That way the diff doesn’t show any 
XML tags, just content text. 
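
Both the substring test and the sed substitution can be exercised on a one-line sample. This is a sketch only; split_tags and is_absolute are hypothetical names, and the XML snippet is invented.

```bash
#!/usr/bin/env bash
# Illustration only; split_tags and is_absolute are hypothetical names.

# Put a newline after every > and before every <. The backslash at the
# end of a line escapes the newline so that it lands in the replacement.
# The continuation lines must start in column one, or the leading
# spaces would become part of the replacement text.
split_tags() {
    sed -e 's/>/>\
/g' -e 's/</\
</g'
}

# ${1:0:1} is the one-character substring at offset zero.
is_absolute() {
    [[ ${1:0:1} == '/' ]]
}

echo '<p>Hello</p><p>World</p>' | split_tags
is_absolute /tmp/a.odt && echo "absolute"
is_absolute a.odt      || echo "relative"
```

In the output each piece of text sits on a line of its own, separated from the tags around it, which is what lets diff line up the content.
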


See Also 

• Recipe 8.7, “Uncompressing Files” 

• Recipe 13.3, “Parsing Some HTML” 

• Recipe 14.11, “Using Secure Temporary Files” 

• Recipe 17.3, “Unzipping Many ZIP Files” 

• Recipe 17.10, “Using diff and patch” 


Chapter 13. Parsing and Similar Tasks 


This is a chapter of tasks that programmers might recognize. The recipes here 
aren’t necessarily more advanced than the other bash script recipes in the 
book, but if you are not a programmer, these tasks might seem obscure or 
irrelevant to your use of bash. We won’t do much explaining of the reasons 
why you’d find yourself in these situations (as a programmer, you’ll 
recognize some if not all of them). Even if you don’t recognize the situations, 
though, you should read them for what you can learn about bash. 


Some of the recipes in this chapter include the parsing of command-line 
arguments. Recall that the typical way to specify options on a shell script is to 
have a leading minus sign and a single letter. For example, an option for your 
script to give fewer messages might use -q as a flag to mean quiet mode. 
Sometimes an option might take an argument. For example, a user option 
where you need to specify a username might use -u followed by the 
username. This distinction will be made clear in this chapter’s first recipe. 


Some Linux commands also allow long-form options. Using the previous 
example of a short-format -u option, a command might also support a long 
format like --user=username. We will not be showing any long-format 
options, though they could be used for some of the techniques that we show. 
The best way to parse long arguments is to use the getopt (note no s) 
command. 


13.1 Parsing Arguments for Your Shell Script 


Problem 


You want to have some options on your shell script, some flags that users can
use to alter its behavior. You could do the parsing directly, using ${#} to tell
you how many arguments have been supplied and using ${1:0:1} to test the
first character of the first argument to see if it is a minus sign. You would
need some if/then or case logic to identify which option it is and whether it
takes an argument, though. And what if the user doesn’t supply a required
argument, or calls your script with two options combined (e.g., -ab)? Will
you also parse for that? The need to parse options for a shell script is a
common situation. Lots of scripts have options. Isn’t there a more standard
way to do this?
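
To make the tedium concrete, here is a minimal sketch of the hand-rolled approach just described. The function name manual_parse and the options -a and -b are hypothetical, chosen only for illustration, and the sketch deliberately makes no attempt to handle combined options such as -ab.

```bash
#!/usr/bin/env bash
# Hand-rolled option parsing (illustrative sketch only; manual_parse is
# a hypothetical name, and -a / -b are made-up options).
manual_parse() {
    local aflag='' bval=''
    # keep going while there are arguments and the next one starts with "-"
    while [ ${#} -gt 0 ] && [ "${1:0:1}" = "-" ]
    do
        case "$1" in
        -a) aflag=1 ;;
        -b) # -b needs a value; complain if none was supplied
            if [ ${#} -lt 2 ]; then
                echo "missing argument for -b" >&2
                return 2
            fi
            bval="$2"
            shift ;;
        *)  echo "unknown option: $1" >&2
            return 2 ;;
        esac
        shift
    done
    echo "aflag=${aflag} bval=${bval} rest=$*"
}

manual_parse -a -b alt plow harvest
```

Even this short version needs its own checks for missing values and unknown options, and it still rejects -ab as unknown, which is the kind of work a standard parser should take off your hands.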


Solution 
Use bash’s builtin getopts command to help parse options. 


Example 13-1, based largely on the example in the manpage for getopts, 
illustrates. 


Example 13-1. ch13/getopts_example 


#!/usr/bin/env bash
# cookbook filename: getopts_example
#
# using getopts
#
aflag=
bflag=
while getopts 'ab:' OPTION
do
  case $OPTION in
  a)  aflag=1
      ;;
  b)  bflag=1
      bval="$OPTARG"
      ;;
  ?)  printf "Usage: %s: [-a] [-b value] args\n" ${0##*/} >&2
      exit 2
      ;;
  esac
done
shift $(($OPTIND - 1))

if [ "$aflag" ]
then
  printf "Option -a specified\n"
fi
if [ "$bflag" ]
then
  printf 'Option -b "%s" specified\n' "$bval"
fi
printf "Remaining arguments are: %s\n" "$*"


Discussion 


There are two kinds of options supported here. The first and simpler kind is 
an option that stands alone. It typically represents a flag to modify a 
command’s behavior. An example of this sort of option is the -l option on 
the ls command. The second kind of option requires an argument. An
example of this is the mysql command’s -u option, which requires that a 
username be supplied, as in mysql -u sysadmin. Let’s look at how getopts 
supports the parsing of both kinds. 


getopts takes two arguments: 
getopts 'ab:' OPTION 


The first is a list of option letters. The second is the name of a shell variable. 
In our example we are defining -a and -b as the only two valid options, so 
the first argument to getopts has just those two letters...and a colon. What
does the colon signify? It indicates that -b needs an argument, just like -u 
username or -f filename might be used. The colon needs to be adjacent to 
any option letter taking an argument. For example, if only -a took an 
argument we would need to write a:b instead. 
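
A small sketch may help show how moving the colon changes which option consumes the next word. The helper show_opts is hypothetical; its first parameter is the option string handed to getopts, and the remaining parameters are the arguments to parse.

```bash
#!/usr/bin/env bash
# Sketch: the colon follows whichever letter takes an argument.
# show_opts is a hypothetical helper; its first parameter is the option
# string to hand to getopts, and the rest are the arguments to parse.
show_opts() {
    local optstring="$1" opt OPTIND=1
    shift
    while getopts "$optstring" opt
    do
        echo "option=$opt argument=${OPTARG:-none}"
    done
}

show_opts 'ab:' -a -b value    # -b takes an argument
show_opts 'a:b' -a value -b    # now -a takes the argument instead
```

Declaring OPTIND local and setting it to 1 is what lets the function be called more than once; without that, the second call would resume where the first left off.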


The getopts builtin will set the variable named in the second argument to the
value that it finds when it parses the shell script’s argument list ($1, $2, etc.).
If it finds an argument with a leading minus sign, it will treat that as an option
argument and put the letter into the given variable ($OPTION in our example).
Then it returns true (i.e., 0) so that the while loop will process the option,
then continue to parse options by repeated calls to getopts until it runs out of
arguments (or encounters a double minus, --, which allows users to put an
explicit end to the options). Then getopts returns false (i.e., a nonzero value)
and the while loop ends.
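
The effect of the double minus can be seen in a short sketch. The function count_a is hypothetical and exists only for this demonstration; it counts how many times -a was parsed as an option and reports what is left over.

```bash
#!/usr/bin/env bash
# Sketch: "--" ends option processing, so a later "-a" is left alone.
# count_a is a hypothetical function used only for this demonstration.
count_a() {
    local opt OPTIND=1 seen=0
    while getopts 'a' opt
    do
        [ "$opt" = a ] && seen=$((seen + 1))
    done
    shift $((OPTIND - 1))
    echo "saw -a $seen time(s); remaining: $*"
}

count_a -a -a file      # both -a flags are parsed as options
count_a -a -- -a file   # the second -a is an ordinary argument
```

This is how a user can pass a filename that happens to begin with a minus sign.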


Inside the loop, when the parsing has found an option letter for processing,
we use a case statement on the variable $OPTION to set flags or otherwise
take action when the option is encountered. For options that take arguments,
that argument is placed in the shell variable $OPTARG (a fixed name not
related to our use of $OPTION as our variable). We need to save that value by
assigning it to another variable because as the parsing continues to loop, the
variable $OPTARG will be reset on each call to getopts.


The third case of our case statement is a question mark, a shell pattern that
matches any single character. When getopts finds an option that is not in the
set of expected options ('ab:' in our example), it will return a literal
question mark in the variable ($OPTION in our example). So, we could have
made our case statement read \? or '?' for an exact match, but the ? as a
pattern match of any single character provides a convenient default. It will
match a literal question mark as well as matching any other single character.


In the usage message that we print, we have made two changes from the 
example script in the manpage. First, we use ${0##*/} to give the name of 
the script without the pathname that may have been part of how it was 
invoked. Secondly, we redirect this message to standard error (>&2) because
that is really where such messages belong. All of the error messages from 
getopts that occur when an unknown option or missing argument is 
encountered are written to standard error; we add our usage message to that 
chorus. 


When the while loop terminates, we see the next line to be executed is: 
shift $(($OPTIND - 1))


which is a shift statement used to move the positional parameters of the
shell script from $1, $2, etc. down a given number of positions (tossing the
lower ones). The variable $OPTIND is an index into the arguments that getopts
uses to keep track of where it is when it parses. Once we are done parsing, we
can toss all the options that we’ve processed by executing this shift
statement. For example, if we had this command line:


myscript -a -b alt plow harvest reap 


then after parsing for options, $OPTIND would be set to 4. By doing a shift
of three ($OPTIND - 1) we would get rid of the options, and then a quick echo
$* would give this:


plow harvest reap 


The remaining (nonoption) arguments will then be ready for use in our script 
(in a for loop, perhaps). In our example script, the last line is a printf 
showing all the remaining arguments. 
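
As a rough sketch of that last step, the following wraps the same parsing in a hypothetical function, process_args, so that it can be tried without a separate script file. It prints the value of OPTIND before the shift and then walks the remaining arguments in a for loop.

```bash
#!/usr/bin/env bash
# Sketch: after the shift, "$@" holds only the non-option arguments.
# process_args is a hypothetical wrapper so the example is self-contained.
process_args() {
    local opt word OPTIND=1 bval=''
    while getopts 'ab:' opt
    do
        case $opt in
        b) bval="$OPTARG" ;;
        a) : ;;            # flag accepted but unused in this sketch
        ?) return 2 ;;
        esac
    done
    echo "OPTIND is $OPTIND"
    shift $((OPTIND - 1))
    for word in "$@"
    do
        echo "argument: $word"
    done
}

process_args -a -b alt plow harvest reap
```

Using "$@" in quotes, rather than $*, keeps any argument that contains spaces together as one word.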


See Also 


• help case
• help getopts
• help getopt
• Recipe 5.8, “Looping Over Arguments Passed to a Script”
• Recipe 5.11, “Counting Arguments”
• Recipe 5.12, “Consuming Arguments”
• Recipe 5.18, “Changing Pieces of a String”
• Recipe 5.20, “Using bash for basename”
• Recipe 6.10, “Looping for a While”
• Recipe 6.14, “Branching Many Ways”
• Recipe 6.15, “Parsing Command-Line Arguments”
• Recipe 13.2, “Parsing Arguments with Your Own Error Messages”


13.2 Parsing Arguments with Your Own Error 
Messages 


Problem 


You are using getopts to parse the options for your shell script, but you don’t 
like the error messages that it writes when it encounters bad input. Can you 
still use getopts but write your own error handling? 


Solution 


If you just want getopts to be quiet and not report any errors at all, just assign 
OPTERR=0 before you begin parsing. But if you want getopts to give you more 
information without the error messages, then begin the option list with a 
colon, as shown in the script in Example 13-2. (The quotes around the option 
list are optional.) 


Example 13-2. ch13/getopts_custom 


#!/usr/bin/env bash
# cookbook filename: getopts_custom
#
# using getopts - with custom error messages
#
aflag=
bflag=
# since we don't want getopts to generate error
# messages, but want this script to issue its
# own messages, we will put, in the option list, a
# leading ':' to silence getopts.
while getopts :ab: FOUND
do
  case $FOUND in
  a)  aflag=1
      ;;
  b)  bflag=1
      bval="$OPTARG"
      ;;
  \:) printf "argument missing from -%s option\n" $OPTARG
      printf "Usage: %s: [-a] [-b value] args\n" ${0##*/}
      exit 2
      ;;
  \?) printf "unknown option: -%s\n" $OPTARG
      printf "Usage: %s: [-a] [-b value] args\n" ${0##*/}
      exit 2
      ;;
  esac >&2
done
shift $(($OPTIND - 1))

if [ "$aflag" ]
then
  printf "Option -a specified\n"
fi
if [ "$bflag" ]
then
  printf 'Option -b "%s" specified\n' "$bval"
fi
printf "Remaining arguments are: %s\n" "$*"


Discussion 


The script is very similar to the one in Recipe 13.1; see that recipe’s
Discussion section for more background. One difference here is that getopts
may now return a colon. It does so when an option’s argument is missing (e.g.,
when the user invokes the script with -b but without an argument for it). In
that case, it puts the option letter into $OPTARG so that you know what option
it was that was missing its argument.


Similarly, if an unsupported option is given (e.g., if the user tries -d when
invoking the script) getopts returns a question mark as the value for $FOUND,
and puts the letter (the d in this case) into $OPTARG so that it can be used in
the error messages.
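
To see those two values directly, here is a minimal sketch. The function probe is hypothetical; it simply prints what getopts placed in the named variable and in OPTARG on each pass through the loop.

```bash
#!/usr/bin/env bash
# Sketch: with a leading colon in the option list, getopts reports
# problems through the variable and OPTARG instead of printing messages.
# probe is a hypothetical function for this demonstration.
probe() {
    local found OPTIND=1
    while getopts ':ab:' found
    do
        echo "found='$found' OPTARG='$OPTARG'"
    done
}

probe -d      # unknown option
probe -b      # -b without its argument
```

The first call reports a question mark with d in OPTARG; the second reports a colon with b in OPTARG.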


We put a backslash in front of both the colon and the question mark to
indicate that these are literals and not any special patterns or shell syntax.
While not necessary for the colon, it looks better to have the parallel
construction with the two punctuation marks both being escaped.


We added an I/O redirection on the esac (the end of the case statement), so
that all output from the various printf commands will be redirected to
standard error. This is in keeping with the purpose of standard error, and it
is easier to put the redirection here than to remember to put it on each
printf individually.
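
The same idea works on any case statement, as this small sketch suggests. The function report and its messages are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: a redirection placed after esac applies to every command
# inside the case statement. report is a hypothetical function.
report() {
    case $1 in
    missing) printf "argument missing\n"
             printf "try again\n"
             ;;
    *)       printf "unknown problem\n"
             ;;
    esac >&2
}

report missing 2>/dev/null   # prints nothing: both lines went to stderr
report missing 2>&1          # both lines appear once stderr is merged into stdout
```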


See Also 


• help case
• help getopts
• help getopt
• Recipe 5.8, “Looping Over Arguments Passed to a Script”
• Recipe 5.11, “Counting Arguments”
• Recipe 5.12, “Consuming Arguments”
• Recipe 5.18, “Changing Pieces of a String”
• Recipe 5.20, “Using bash for basename”
• Recipe 6.15, “Parsing Command-Line Arguments”
• Recipe 13.1, “Parsing Arguments for Your Shell Script”


13.3 Parsing Some HTML 


Problem 


You want to pull the strings out of some HTML. For example, you’d like to
get at the href="urlstringstuff"-type strings from the <a> tags within a
chunk of HTML.


Solution


For a quick and easy shell parse of HTML, provided it doesn’t have to be
foolproof, you might want to try something like this:


cat $1 | sed -e 's/>/>\
/g' | grep '<a' | while IFS='"' read a b c ; do echo $b; done


Discussion 


Parsing HTML from bash is pretty tricky, mostly because bash tends to be 
very line-oriented whereas HTML was designed to treat newlines like 
whitespace. So, it’s not uncommon to see tags split across two or more lines, 
as in: 


<a href="blah..." rel="blah..." media="blah..." 
target="blah..." >


There are also two ways to write <a> tags, one with a separate ending </a> 
tag and one without, where instead the singular <a> tag itself ends with a />. 
Between this and the potential for multiple tags on a line and tags split across 
lines, it’s a bit messy to parse, and our simple bash technique for this is often 
not foolproof. 


Here are the steps involved in our solution. First, break the multiple tags on 
one line into at most one line per tag: 


cat file | sed -e 's/>/>\ 
/g' 


Yes, that’s a newline right after the backslash so that it substitutes each
end-of-tag character (i.e., the >) with that same character and then a newline.
That will put tags on separate lines, with maybe a few extra blank lines. The
trailing g tells sed to do the search and replace globally; i.e., multiple times
on a line if need be.


Then you can pipe that output into grep to grab just the <a tag lines:


cat file | sed -e 's/>/>\
/g' | grep '<a'


or maybe just lines with double quotes:


cat file | sed -e 's/>/>\
/g' | grep '".*"'


The single quotes tell the shell to take the inner characters literally and not do 
any shell expansion on them, and the rest is a regular expression to match a 
double quote followed by any character (.) any number of times (*), 
followed by another double quote. (This won’t work if the string itself is split 
across lines.) 


To parse out the contents of what’s inside the double quotes, one trick is to
use the shell’s internal field separator ($IFS) to tell it to use the double quote
(") as the separator. You can do a similar thing with awk and its -F (field
separator) option.


For example: 


cat $1 | sed -e 's/>/>\
/g' | grep '".*"' | awk -F'"' '{ print $2}'


(Or use grep '<a' if you just want <a tags and not all quoted strings.) 


If you want to use the $IFS shell trick rather than awk, it would be: 


cat $1 | sed -e 's/>/>\
/g' | grep '<a' | while IFS='"' read PRE URL POST ; do echo $URL; done


where the grep output is piped into a while loop that reads the input into
three fields (PRE, URL, and POST). By preceding the read command with
IFS='"', we set that environment variable just for the read command, not for
the entire script. Thus, it will parse with the quotes as its notion of what
separates the words of the input line. It will set PRE to be everything up to the
first quote, URL to be everything from there to the next quote, and POST to be
everything thereafter. Then the script just echoes the second variable, URL;
that’s all the characters between the quotes.
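
To try the whole pipeline without a file on disk, here is a sketch that wraps it in a hypothetical function, extract_hrefs, which reads HTML from standard input. The sample HTML and its URLs are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: the sed | grep | read pipeline as a function reading stdin.
# extract_hrefs is a hypothetical name; the sample HTML is made up.
extract_hrefs() {
    local PRE URL POST
    # one tag per line, keep the <a lines, then split each on double quotes
    sed -e 's/>/>\
/g' | grep '<a' | while IFS='"' read -r PRE URL POST
    do
        echo "$URL"
    done
}

extract_hrefs <<'EOF'
<p><a href="one.html">first</a> and <a href="two.html" rel="next">second</a></p>
EOF
```

This prints one.html and two.html, each on its own line. It relies on href being the first quoted attribute in the tag, which is one of the reasons the technique is not foolproof.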


See Also 


• man sed
• man grep


13.4 Parsing Output into an Array 


Problem 


You want the output of some program or script to be put into an array. 


Solution 
Example 13-3 illustrates how to use an array to parse the output into words. 


Example 13-3. ch13/parseViaArray 


#!/usr/bin/env bash
# cookbook filename: parseViaArray
#
# find the file size
# use an array to parse the ls -l output into words

LSL=$(ls -ld $1)

declare -a MYRA
MYRA=($LSL)

echo the file $1 is ${MYRA[4]} bytes.


Discussion 


In our example, we take the output from the ls -l command and parse the
words by putting them into an array. Then we can just refer to each array
element to get at each word. (Remember that the arrays are zero-based, so an
index of 4 gives us the fifth element.) The typical output from the ls -l
command looks like this (yours may vary due to locale):


-rw-r--r-- 1 albing users 113 2006-10-10 23:33 mystuff.txt


Arrays are easy to initialize if you know the values as you write the script. 
The format is simple. We begin by declaring the variable to be an array, and 
then we assign it values: 


declare -a MYRA 
MYRA=(first second third home) 


The same can be done by using a variable inside those parentheses. Just be
sure not to use quotes around the variable. Writing MYRA=("$LSL") will put
the entire string into the first element, since it is all contained as one quoted
string. Then ${MYRA[0]} will be the only array element, and it will contain
the entire string, which is not what you wanted.
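
A quick sketch makes the difference visible. The sample line is a typical line of ls -l output held in a variable rather than produced by running ls, and the array names SPLIT and WHOLE are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: word splitting decides how many elements the array gets.
# The sample line stands in for real ls -l output.
LSL='-rw-r--r-- 1 albing users 113 2006-10-10 23:33 mystuff.txt'

declare -a SPLIT WHOLE
SPLIT=($LSL)      # unquoted: one element per word
WHOLE=("$LSL")    # quoted: the entire line becomes element 0

echo "unquoted: ${#SPLIT[@]} elements, size field is ${SPLIT[4]}"
echo "quoted:   ${#WHOLE[@]} element"
```

The unquoted form yields eight elements with 113 at index 4; the quoted form yields a single element.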


We also could have shortened this script by combining the steps like this: 


declare -a MYRA 
MYRA=($(ls -ld $1))


If you want to know how many elements you have in your new array, just
reference the variable ${#MYRA[*]} or ${#MYRA[@]}, either of which is a lot
of special characters to type.


See Also 
• Recipe 5.23, “Using Array Variables”


13.5 Parsing Output with a Function Call 


Problem


You want to parse the output of some program into various variables to be
used elsewhere in your program. Arrays are great when you are looping
through the values, but not very readable if you want to refer to each
separately, rather than by an index.


Solution 
Use a function call to parse the words, as shown in Example 13-4. 


Example 13-4. ch13/parseViaFunc 


#!/usr/bin/env bash
# cookbook filename: parseViaFunc
#
# parse ls -l via function call
# an example of output from ls -l follows:
# -rw-r--r-- 1 albing users 126 Jun 10 22:50 fnsize

function lsparts ()
{
    PERMS=$1
    LCOUNT=$2
    OWNER=$3
    GROUP=$4
    SIZE=$5
    CRMONTH=$6
    CRDAY=$7
    CRTIME=$8
    FILE=$9
}

lsparts $(ls -l "$1")
echo $FILE has $LCOUNT 'Link(s)' and is $SIZE bytes long.


Here’s what it looks like when it runs: 


$ ./fnsize fnsize 
fnsize has 1 Link(s) and is 311 bytes long. 
$


Discussion


We can let bash do the work of parsing by putting the text to be parsed in a 
function call. Calling a function is much like calling a shell script. bash 
parses the words into separate variables and assigns them to $1, $2, etc. Our 
function can just assign each positional parameter to a separate variable. If 
the variables are not declared locally then they are available outside as well 
as inside the function. 
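
The point about local versus global variables can be seen in a short sketch. The function splitdate and all of its variable names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: bash splits the unquoted text into $1, $2, ... for the function.
# splitdate and the variable names are hypothetical.
splitdate() {
    local sep="-"          # local: not visible after the function returns
    YEAR=$1                # not declared local, so these stay set
    MONTH=$2
    DAY=$3
    JOINED="${YEAR}${sep}${MONTH}${sep}${DAY}"
}

text="2006 10 10"
splitdate $text            # unquoted on purpose, so it becomes three words
echo "year=$YEAR month=$MONTH day=$DAY joined=$JOINED sep=${sep:-unset}"
```

After the call, YEAR, MONTH, DAY, and JOINED are available to the rest of the script, while sep is not.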


We put quotes around the reference to $1 in the ls command in case the
filename supplied has spaces in its name. The quotes keep it all together so
that ls sees it as a single filename and not as a series of separate filenames.


We use quotes in the expression 'Link(s)' to avoid special treatment of the
parentheses by bash. Alternatively, we could have put the entire phrase
(except for the echo itself) inside of double quotes (double, not single, so
that the variable substitution for $FILE, etc. still occurs).


WARNING 
You might need to adjust the field list depending on how your computer and ls
command present the date, or add options to your ls command to modify its
output. For example, ls -l --time-style="long-iso" will produce a
slightly different output format where month and day are replaced by a
YYYY-MM-DD format date; you would then want to replace CRMONTH and CRDAY with a
single variable, say, CRDATE, and adjust the field numbers accordingly.


See Also 

• Recipe 10.4, “Defining Functions”
• Recipe 10.5, “Using Functions: Parameters and Return Values”
• Recipe 13.9, “Getting Your Plurals Right”
• Recipe 17.7, “Clearing the Screen When You Log Out”


13.6 Parsing Text with a read Statement


Problem 


There are many ways to parse text with bash. What if you don’t want to use a 
function? Is there another way? 


Solution 


Use the read statement, as in Example 13-5. 


Example 13-5. ch13/parseViaRead 


#!/usr/bin/env bash
# cookbook filename: parseViaRead
#
# parse ls -l with a read statement
# an example of output from ls -l follows:
# -rw-r--r-- 1 albing users 126 2006-10-10 22:50 fnsize

ls -l "$1" | { read PERMS LCOUNT OWNER GROUP SIZE CRDATE CRTIME FILE ;
               echo $FILE has $LCOUNT 'Link(s)' and is $SIZE bytes long. ;
             }


Discussion 


Here we let read do all the parsing. It will break apart the input into words,
where words are separated by whitespace, and assign each word to the
variables named in the read statement. Actually, you can even change the
separator, by setting the bash $IFS (internal field separator) variable to
whatever character you want for parsing; just remember to set it back!
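
Here is a sketch of two ways to change the separator, using a made-up colon-separated record. The first sets $IFS for the whole shell and restores it afterward; the second puts the assignment in front of the read so that it applies to that one command only. All variable names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: parse a colon-separated record two ways. The record is made up.
record='jp:x:1000:100:JP:/home/jp:/bin/bash'

# 1. Set IFS for the whole shell, then put it back afterward.
OLDIFS=$IFS
IFS=':'
read -r LOGIN PASSWD USERID REST <<< "$record"
IFS=$OLDIFS

# 2. Prefix the read with the assignment; IFS changes only for that command.
IFS=':' read -r NAME2 PASS2 ID2 REST2 <<< "$record"

echo "$LOGIN has user ID $USERID; the rest is $REST"
```

The second form is usually the safer habit, since there is nothing to forget to restore.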


As you can see from the sample output of ls -l, we have tried to choose 
names that get at the meaning of each word in the output. Since FILE is the 
last word, any extra fields will also be part of that variable. That way if the 
name has whitespace in it, like “Beethoven Fifth Symphony,” then all three 
words will end up in $FILE.


See Also


• Recipe 2.14, “Saving or Grouping Output from Several Commands”
• Recipe 19.8, “Forgetting that Pipelines Make Subshells”


13.7 Parsing with read into an Array 


Problem 


You’ve got a varying number of words on each line of input, so you can’t just 
assign each word to a predetermined variable. 


Solution 


Use the -a option on the read command, and the words will be read into an 
array variable: 


read -a MYRAY 


Discussion 


Whether coming from user input or a pipeline, read will parse the input into 
words, putting each word in its own array element. The variable does not 
need to be declared as an array—using it in this fashion is enough to make it 
into an array. Each element can be referenced with the bash array syntax. 
Arrays in bash are zero-based, so the second word on a line of input will be 
put into ${MYRAY[1]} in our example. The number of words will determine 
the size of the array. In our example, the size of the array is ${#MYRAY[@]}.
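
A brief sketch shows the array growing and shrinking with each line. The function count_words is hypothetical and the input lines are made up.

```bash
#!/usr/bin/env bash
# Sketch: each input line may have a different number of words.
# count_words is a hypothetical function; the sample lines are made up.
count_words() {
    local -a MYRAY
    while read -r -a MYRAY
    do
        echo "${#MYRAY[@]} words, second is '${MYRAY[1]}'"
    done
}

count_words <<'EOF'
plow harvest reap
sow
one two three four five
EOF
```

For the one-word line, element 1 does not exist, so the second word prints as an empty string.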


See Also 
• Recipe 3.5, “Getting User Input”
• Recipe 13.6, “Parsing Text with a read Statement”


13.8 Reading an Entire File


Problem 


You want to read in a whole file and then parse it. Must you do this using a 
for loop and reading one line at a time, or is there a shorthand? 


Solution 


Use the mapfile or readarray command in bash. They are identical 
commands that take the same arguments and let you read an entire file into an 
array, one array entry for each line of the file, with one statement. 


The choice of command, either readarray or mapfile, seems to be one of 
perspective—are you thinking about the destination (the array), or the source 
(the datafile)? Use whichever makes more sense to you. They are 
interchangeable. 


Here’s a sample mapfile command, part of a fuller example in the discussion: 
mapfile -t -s 1 -n 1500 -C showprg -c 100 BIGDATA < /tmp/myfile.data 


This command will discard (i.e., skip) the first line (-s 1) of input, reading
up to 1,500 lines (-n 1500) and discarding the newline at the end of each line 
(-t). Every 100 lines (-c 100) it will call a user-defined function called 
showprg (to show progress in reading the file; the default is every 5,000 
lines). The data is put into the array called BIGDATA, one line of input per 
entry. Input is redirected from the file as shown. 
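
The same options can be tried on a tiny input, as in this sketch. The callback tick, the array name LINES_READ, and the input lines are all made up, and the numbers are scaled down so the effect of each option is easy to see. Note that mapfile needs bash 4 or newer.

```bash
#!/usr/bin/env bash
# Sketch: the same options on a tiny, made-up input. mapfile needs bash 4
# or newer. tick is a hypothetical callback.
tick() { printf "."; }

# skip 1 line, read at most 3, strip newlines, call tick every 2 lines
mapfile -t -s 1 -n 3 -C tick -c 2 LINES_READ <<'EOF'
header line (skipped)
first
second
third
fourth (never read)
EOF
echo
echo "read ${#LINES_READ[@]} lines; first is '${LINES_READ[0]}'"
```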


Discussion 


Here’s the first part of an example use of mapfile (or readarray, if you 
prefer). It reads the file and shows progress as it reads. Then it prints out how 
many lines it read (i.e., the size of the array):


# use mapfile to read in $1
# show progress with dots
function showprg ()
{
    printf "."
}

# create a large datafile for our use
ls -l /usr/bin > /tmp/myfile.data

# a.k.a. readarray; load up BIGDATA
mapfile -t -s 1 -n 1500 -C showprg -c 100 BIGDATA < /tmp/myfile.data

# put a newline at the end of the showprg output
echo

# how many lines did we read?
siz=${#BIGDATA[@]}
echo "size: ${siz}"


The showprg function will print a dot (but no newline) each time it is called. 
This will show progress when reading in a large file. You could do something 
much fancier if you wanted; it’s whatever function you want, after all. 


So now that the file has been read into the array, what might we do with all 
that data? In this case it’s a very long output from the ls command. We could
now go through the file one line at a time and print out some of the data: 


# number the lines as we print them out
for((i=0; i<siz; i++))
do
    ALINE=${BIGDATA[i]}
    if [[ ${ALINE:0:1} == 'l' ]]  # only symbolic links
    then
        # print the relevant substring
        printf "%4d: %s\n" $i "${ALINE:48}"
    fi
done


rm /tmp/myfile.data # clean up


In this case, the script will look at the first character of a line and, if it’s an l,
print out the line, beginning at character 48 (zero-based). Since the data in the
file is the “long” output of an ls command, such a first character indicates that
we are looking at a symbolic link. (Similarly, a d would indicate a directory,
but we don’t use that here.)


Here is an excerpt of the output that might result from running the whole 
script. The first line shows the dots that appear as the progress of the read: 


...............
size: 1500
(other output, and then)
1307: rsh -> /etc/alternatives/rsh
1311: rtstat -> lnstat
1315: rview -> /etc/alternatives/rview
(even more output)


See Also 


• Recipe 13.7, “Parsing with read into an Array”


13.9 Getting Your Plurals Right 


Problem 


You want to use a plural noun when you have more than one of an object. 
But you don’t want to scatter if statements all through your code. 


Solution 
Example 13-6 illustrates a way to make words plural. 


Example 13-6. ch13/pluralize 


#!/usr/bin/env bash
# cookbook filename: pluralize
#
# A function to make words plural by adding an s
# when the value ($2) is != 1 or -1.
# It only adds an 's'; it is not very smart.
#
function plural ()
{
    if [ $2 -eq 1 -o $2 -eq -1 ]
    then
        echo ${1}
    else
        echo ${1}s
    fi
}

while read num name
do
    echo $num $(plural "$name" $num)
done


Discussion 


The function, though only set to handle the simple addition of an s, will do 
fine for many nouns. The function doesn’t do any error checking of the 
number or contents of the arguments. If you wanted to use this script in a 
serious application, you might want to add those kinds of checks. 


We put the name in quotes when we call the plural function in case there are 
embedded blanks in the name. It did, after all, come from the read statement, 
and the last variable in a read statement gets all the remaining text from the 
input line. You can see that in the following example. 


We put the script in Example 13-6 into a file named pluralize and ran it 
against the following data: 


$ cat input.file
1 hen
2 duck
3 squawking goose
4 limerick oyster
5 corpulent porpoise

$ ./pluralize < input.file
1 hen
2 ducks
3 squawking gooses
4 limerick oysters
5 corpulent porpoises


“Gooses” isn’t correct English, but the script did what was intended. If you 
like the C-like syntax better, you could write the if statement like this: 


if (( $2 == 1 || $2 == -1 )) 


The square bracket (i.e., the test builtin) is the older form, more common
across the various versions of bash, but either should work. Use whichever 
form’s syntax is easiest for you to remember. 


We don’t expect you would keep a file like pluralize around, but the plural 
function might be handy to have as part of a larger scripting project. Then 
whenever you report on the count of something you could use the plural 
function as part of the reference, as shown in the while loop in the script. 
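
As a sketch of that kind of use, here is the function written with the arithmetic form of the test and called from inside a report line. The counts and the noun are made up.

```bash
#!/usr/bin/env bash
# Sketch: the plural function written with the arithmetic form of the test,
# used inside a larger message. The counts and nouns are made up.
plural() {
    if (( $2 == 1 || $2 == -1 ))
    then
        echo "${1}"
    else
        echo "${1}s"
    fi
}

for n in 0 1 2
do
    echo "found $n $(plural "file" "$n")"
done
```

This prints "found 0 files", "found 1 file", and "found 2 files".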


See Also 
• Recipe 6.11, “Looping with a read”


13.10 Taking It One Character at a Time


Problem 


You have some parsing to do, and for whatever reason nothing else will do— 
you need to take your strings apart one character at a time. 


Solution 


The substring function for variables will let you take things apart, and another
feature tells you how long a string is. Example 13-7 demonstrates their use.


Example 13-7. ch13/onebyone


#!/usr/bin/env bash
# cookbook filename: onebyone
#
# parsing input one character at a time

while read ALINE
do
    for ((i=0; i < ${#ALINE}; i++))
    do
        ACHAR=${ALINE:i:1}
        # do something here, e.g. echo $ACHAR
        echo $ACHAR
    done
done


Discussion


The read statement will take input from standard input and put it, a line at a
time, into the variable $ALINE. Since there are no other variables in the read
statement, it takes the entire line and doesn’t divvy it up, but it will remove
leading and trailing $IFS whitespace unless you use IFS= read or just read
by itself and later reference the default $REPLY variable.


The for loop will loop once for each character in the $ALINE variable. We
can compute how many times to loop by using ${#ALINE}, which returns the
length of the contents of $ALINE.


Each time through the loop we assign $ACHAR the value of the one-character
substring of $ALINE that begins at the ith position. That’s simple enough.
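
If leading and trailing blanks matter, a variation along these lines may help. It is only a sketch: the function spell is hypothetical, and it wraps each character in brackets so that spaces show up in the output.

```bash
#!/usr/bin/env bash
# Sketch: IFS= keeps leading and trailing blanks, and -r keeps backslashes.
# spell is a hypothetical function; it brackets each character so that
# spaces are visible.
spell() {
    local line i out
    while IFS= read -r line
    do
        out=''
        for ((i = 0; i < ${#line}; i++))
        do
            out+="[${line:i:1}]"
        done
        echo "$out"
    done
}

spell <<< ' ab '
```

With the input ' ab ' this prints [ ][a][b][ ], showing that both blanks survived the read.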


See Also 

• Recipe 13.1, “Parsing Arguments for Your Shell Script”
• Recipe 13.4, “Parsing Output into an Array”
• Recipe 13.5, “Parsing Output with a Function Call”
• Recipe 13.6, “Parsing Text with a read Statement”
• Recipe 13.7, “Parsing with read into an Array”


13.11 Cleaning Up an SVN Source Tree


Problem 


Subversion’s svn status command shows all the files that have been 
modified, but if you have scratch files or other garbage lying around in your 
source tree, svn will list those, too. It would be useful to have a way to clean 
up your source tree, removing those files unknown to Subversion. 


WARNING 


Subversion won’t know about new files unless and until you do an svn add 
command. Don’t run this script until you’ve added any new source files, or 
they’ll be gone for good. 


Solution 


You can grep output from the svn status command and read that to create 
a list of files to delete: 


svn status src | grep '^\?' | \
  while read status filename; do echo "$filename"; rm -rf "$filename";
  done


Discussion 


The svn status output lists one file per line. It puts an M as the first
character of a line for files that have been modified, an A for newly added
(but not yet committed) files, and a question mark for those about which it
knows nothing. We just grep for those lines beginning with a question mark.
We process the output with a read statement in a while loop. The echo isn’t
strictly necessary, but it’s useful to see what’s being removed, just in case
there is a mistake or an error. You can at least see that it’s gone for good.
When we do the remove, we use the -rf options in case the file is a
directory, but mostly just to keep the remove quiet. Problems encountered
with permissions and such are squelched by the -f option; it just removes the
file as best as your permissions allow. We put the reference to the filename in
quotes ("$filename") in case there are special characters, like spaces, in the
filename.
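
Because the removal cannot be undone, it may be worth rehearsing the filter first. The following sketch is a dry run only: the function list_unknown is hypothetical, the status lines are fed in from a here-document with made-up filenames instead of from svn, and nothing is deleted.

```bash
#!/usr/bin/env bash
# Sketch: a dry run that only lists what would be removed. The status
# lines are fed in from a here-document (made-up file names) instead of
# running svn, so the filtering can be tried safely.
list_unknown() {
    local status filename
    grep '^?' | while read -r status filename
    do
        echo "would remove: $filename"
    done
}

list_unknown <<'EOF'
M       src/main.c
A       src/new.c
?       src/scratch.tmp
?       src/old notes.txt
EOF
```

Only the two lines beginning with a question mark are reported, and the filename containing a space stays in one piece because filename is the last variable in the read.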


See Also 


• Recipe 6.11, “Looping with a read”
• Appendix D


13.12 Setting Up a Database with MySQL 


Problem 


You want to create and initialize several databases using MySQL. You want 
them all to be initialized using the same SQL commands. Each database 
needs its own name, but each database will have the same contents, at least at 
initialization. You may need to do this setup over and over, as in the case 
where these databases are used as part of a test suite that needs to be reset 
when tests are rerun. 


Solution 


The simple bash script in Example 13-8 can help with this administrative 
task. 


Example 13-8. ch13/dbiniter 


#!/usr/bin/env bash
# cookbook filename: dbiniter
#
# initialize databases from a standard file
# creating databases as needed

DBLIST=$(mysql -e "SHOW DATABASES;" | tail -n +2)
select DB in $DBLIST "new..."
do
    if [[ $DB == "new..." ]]
    then
        printf "%b" "name for new db: "
        read DB rest
        echo creating new database $DB
        mysql -e "CREATE DATABASE IF NOT EXISTS $DB;"
    fi

    if [ -n "$DB" ]
    then
        echo Initializing database: $DB
        mysql $DB < ourInit.sql
    fi
done


Discussion 


The tail -n +2 is added to remove the heading from the list of databases 
(see Recipe 2.12). 


The select creates the menus showing the existing databases. We added the 
literal "new..." as an additional choice (see Recipe 3.7 and Recipe 6.16). 


When the user wants to create a new database, we prompt for and read a new 
name, but we use two fields in the read statement as a bit of error handling. 
If the user types more than one name on the line, we only use the first name 
—it gets put into the variable $DB while the rest of the input is put into $rest 
and ignored. (We could add an error check to see if $rest is null.) 
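
One possible shape for that check is sketched here without involving mysql at all. The helper read_db_name is hypothetical; it reads one line from standard input and prints the name only when exactly one word was entered.

```bash
#!/usr/bin/env bash
# Sketch: reject input that has extra words or an empty name.
# read_db_name is a hypothetical helper; it reads one line from standard
# input and prints the name only if exactly one word was given.
read_db_name() {
    local DB rest
    read -r DB rest
    if [ -z "$DB" ] || [ -n "$rest" ]
    then
        echo "please enter exactly one name" >&2
        return 1
    fi
    echo "$DB"
}

read_db_name <<< "testdb"                       # accepted
read_db_name <<< "test db" || echo "rejected"   # extra word, so refused
```

In a script like the one above, the caller could loop until the helper succeeds before going on to create the database.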

Whether created anew or chosen from the list of extant databases, if the $DB
variable is not empty, the script will invoke mysql one more time to feed it
the set of SQL statements that we’ve put into the file ourInit.sql as our
standardized initialization sequence.

If you’re going to use a script like this, you might need to add parameters to
your mysql command, such as -u to supply a username and -p to prompt for a
password. It will depend on how your database and its permissions are
configured, or whether you have a file named .my.cnf with your MySQL
defaults.


We could also have added an error check after the creation of the new 
database to see if it succeeded; if it did not succeed, we could unset DB, 
thereby bypassing the initialization. However, as many a math textbook has 
said, “we leave that as an exercise for the reader.” 


See Also 


• Recipe 2.12, “Skipping a Header in a File”
• Recipe 3.7, “Selecting from a List of Options”
• Recipe 6.16, “Creating Simple Menus”
• Recipe 14.20, “Using Passwords in Scripts”


13.13 Isolating Specific Fields in Data


Problem 


You need to extract one or more fields from each line of output. 


Solution 


Use cut if there are delimiters you can easily pick out, even if they are 
different for the beginning and end of the field you need: 


# Here's an easy one - what users, home directories and shells do
# we have on this NetBSD system?
$ cut -d':' -f1,6,7 /etc/passwd
root:/root:/bin/csh
toor:/root:/bin/sh
daemon:/:/sbin/nologin
operator:/usr/guest/operator:/sbin/nologin
bin:/:/sbin/nologin
games:/usr/games:/sbin/nologin
postfix:/var/spool/postfix:/sbin/nologin
named:/var/chroot/named:/sbin/nologin
ntpd:/var/chroot/ntpd:/sbin/nologin
sshd:/var/chroot/sshd:/sbin/nologin
smmsp:/nonexistent:/sbin/nologin
uucp:/var/spool/uucppublic:/usr/libexec/uucp/uucico
nobody:/nonexistent:/sbin/nologin
jp:/home/jp:/usr/pkg/bin/bash


# What is the most popular shell on the system?
$ cut -d':' -f7 /etc/passwd | sort | uniq -c | sort -rn
  10 /sbin/nologin
   2 /usr/pkg/bin/bash
   1 /bin/csh
   1 /bin/sh
   1 /usr/libexec/uucp/uucico


# Now let's see the first two directory levels
$ cut -d':' -f6 /etc/passwd | cut -d'/' -f1-3 | sort -u
/
/home/jp
/nonexistent
/root
/usr/games
/usr/guest
/var/chroot
/var/spool


Use awk to split on multiples of whitespace, or if you need to rearrange the
order of the output fields. Note the → denotes a tab character in the output.
The default output separator is a space, but you can change that using OFS:


# Users, home directories, and shells, but swap the last two
# and use a tab delimiter
$ awk 'BEGIN { FS=":"; OFS="\t"; } { print $1,$7,$6; }' /etc/passwd
root → /bin/csh → /root
toor → /bin/sh → /root
daemon → /sbin/nologin → /
operator → /sbin/nologin → /usr/guest/operator
bin → /sbin/nologin → /
games → /sbin/nologin → /usr/games
postfix → /sbin/nologin → /var/spool/postfix
named → /sbin/nologin → /var/chroot/named
ntpd → /sbin/nologin → /var/chroot/ntpd
sshd → /sbin/nologin → /var/chroot/sshd
smmsp → /sbin/nologin → /nonexistent
uucp → /usr/libexec/uucp/uucico → /var/spool/uucppublic
nobody → /sbin/nologin → /nonexistent
jp → /usr/pkg/bin/bash → /home/jp


# Multiples of whitespace and swapped, first field removed 
$ grep '^# [1-9]' /etc/hosts | awk '{print $3,$2}' 
10.255.255.255 10.0.0.0 

172.31.255.255 172.16.0.0 

192.168.255.255 192.168.0.0 


Use grep -o to display just the part that matched your pattern. This is 
particularly handy when you can’t express delimiters in a way that lends 
itself to the solutions shown here. For example, say you need to extract all IP 
addresses from a file, no matter where they are. Note we use egrep because of 
the regular expression (regex), but -o should work with whichever GNU grep 
flavor you use (it is probably not supported on non-GNU versions; check 
your documentation): 


$ cat has_ipas 

This is line 1 with 1 IPA: 10.10.10.10 

Line 2 has 2; they are 10.10.10.11 and 10.10.10.12. 
Line three is ftp_server=10.10.10.13:21. 


$ egrep -o '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' has_ipas 
10.10.10.10 
10.10.10.11 
10.10.10.12 
10.10.10.13 


Discussion 


The possibilities are endless, and we haven’t even scratched the surface here. 
This is the very essence of what the Unix toolchain idea is all about: take a 
number of small tools that do one thing well and combine them as needed to
solve problems. 
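
To make the toolchain idea concrete, the "most popular shell" pipeline can be wrapped in a small reusable function. This is a sketch under assumptions: the name most_common_field is hypothetical, the input is colon-delimited text on standard input, and the chosen field is assumed to contain no whitespace.

```shell
# Hypothetical helper: print the most common value of field $1 in
# colon-delimited input read from stdin (for example, /etc/passwd).
# Assumes the field values contain no whitespace.
most_common_field() {
    cut -d':' -f"$1" |      # isolate the requested field
        sort |              # group identical values together
        uniq -c |           # count each group
        sort -rn |          # highest count first
        head -n 1 |         # keep only the winner
        awk '{ print $2 }'  # drop the count, keep the value
}
```

For example, most_common_field 7 < /etc/passwd should report the shell that appears most often. Each stage does one small job, and any stage can be swapped out without touching the others.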


Also, the regex we used for IP addresses is naive and could match other
things, including invalid addresses. For a much better pattern, use the Perl
Compatible Regular Expressions (PCRE) regex from Mastering Regular
Expressions, 3rd Edition, by Jeffrey E. F. Friedl (O’Reilly), if your grep
supports -P:


$ grep -oP '([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])' has_ipas
10.10.10.10
10.10.10.11
10.10.10.12
10.10.10.13
$


Or use Perl: 


$ perl -ne 'while ( m/([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])/g ) {
>     print qq($1.$2.$3.$4\n); }' has_ipas
10.10.10.10
10.10.10.11
10.10.10.12
10.10.10.13

$ 
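
If neither grep -P nor Perl is available, another option is to keep the naive pattern for extraction and then check each candidate in the shell. The following is a sketch rather than a tested replacement for the patterns above; the function name is_valid_ipv4 is hypothetical, and like those patterns it accepts leading zeros.

```shell
# Hypothetical helper: succeed only if $1 is a dotted quad whose four
# parts are each a decimal number from 0 to 255.
is_valid_ipv4() (
    case $1 in
        # Reject empty input, stray characters, and empty octets
        '' | *[!0-9.]* | .* | *. | *..*) exit 1 ;;
    esac
    IFS=.
    set -f              # no pathname expansion while splitting
    set -- $1           # split the address on dots
    [ "$#" -eq 4 ] || exit 1
    for octet in "$@"; do
        # At most three digits, and no larger than 255
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || exit 1
    done
)
```

You could then filter the output of the egrep -o command through a loop such as while read -r ip; do is_valid_ipv4 "$ip" && echo "$ip"; done, which drops matches like 999.10.10.10 that the naive pattern lets through.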


See Also 


■ man cut
■ man awk
■ man grep
■ Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl
(O’Reilly)
■ Recipe 8.4, “Cutting Out Parts of Your Output”
■ Recipe 13.15, “Trimming Whitespace”
■ Recipe 15.10, “Finding My IP Address”
■ Recipe 17.16, “Finding Lines That Appear in One File but Not in
Another”


13.14 Updating Specific Fields in Datafiles


Problem 


You need to extract certain parts (fields) of a line (record) and update them. 


Solution 


In the simple case, you want to extract a single field from a line, then perform 
some operation on it. For that, you can use cut or awk. See Recipe 13.13 for 
details. 


For the more complicated case, you need to modify a field in a datafile 
without extracting it. If it’s a simple search and replace, use sed. 


For example, let’s switch everyone from csh to sh on this NetBSD system: 


$ grep csh /etc/passwd
root:*:0:0:Charlie &:/root:/bin/csh


$ sed 's;/csh$;/sh;' /etc/passwd | grep '^root'
root:*:0:0:Charlie &:/root:/bin/sh


You can use awk if you need to do arithmetic on a field or modify a string 
only in a certain field: 


$ cat data_file 
Line 1 ends 
Line 2 ends
Line 3 ends 
Line 4 ends 
Line 5 ends 


$ awk '{print $1, $2+5, $3}' data_file 
Line 6 ends 
Line 7 ends 
Line 8 ends 
Line 9 ends 
Line 10 ends 


# If the second field contains '3', change it to '8' and mark it
$ awk '{ if ($2 == "3") print $1, $2+5, $3, "Tweaked" ; else print $0; }' \
data_file
Line 1 ends
Line 2 ends
Line 8 ends Tweaked
Line 4 ends
Line 5 ends


Discussion 


The possibilities here are as endless as your data, but hopefully these 
examples will give you enough of a start to easily modify your data. 
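
As one more illustration of modifying a string only in a certain field, here is a sketch that changes the login shell in a passwd-style file only where the seventh field matches exactly. The function name change_shell is hypothetical; the old and new values are passed to awk with -v, and the input is assumed to be colon-delimited.

```shell
# Hypothetical helper: in colon-delimited input on stdin, replace field 7
# with the new value ($2) only where it exactly equals the old value ($1).
change_shell() {
    awk -v old="$1" -v new="$2" '
        BEGIN { FS = OFS = ":" }    # read and write colon-delimited records
        $7 == old { $7 = new }      # touch only exact matches in field 7
        { print }                   # print every record, changed or not
    '
}
```

Because the comparison is anchored to a single field, a matching string elsewhere in the record (a home directory of /bin/csh, say) is left alone, which a plain search and replace cannot promise.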


See Also 


■ man awk
■ man sed
■ http://sed.sourceforge.net/sedfaq.html
■ http://sed.sourceforge.net/sed1line.txt
■ Recipe 11.7, “Figuring Out Date and Time Arithmetic”
■ Recipe 13.13, “Isolating Specific Fields in Data”


13.15 Trimming Whitespace


Problem 


You need to trim leading and/or trailing whitespace from lines for fields of 
data. 


Solution 


These solutions rely on a bash-specific treatment of read and $REPLY. See the 
end of the discussion for an alternate solution. 


First, we'll show a file with some leading and trailing whitespace. Note that
we add ~~ to show the whitespace, and that the → denotes a literal tab
character in the output:


# Show the whitespace in our sample file
$ while read; do echo ~~"$REPLY"~~; done < whitespace
~~ This line has leading spaces.~~
~~This line has trailing spaces. ~~
~~ This line has both leading and trailing spaces. ~~
~~ → Leading tab.~~
~~Trailing tab. → ~~
~~ → Leading and trailing tab. → ~~
~~ → Leading mixed whitespace.~~
~~Trailing mixed whitespace. → ~~
~~ → Leading and trailing mixed whitespace. → ~~
$


To trim both leading and trailing whitespace, use $IFS and the builtin REPLY 
variable (see the discussion for why this works): 


$ while read REPLY; do echo ~~"$REPLY"~~; done < whitespace
~~This line has leading spaces.~~
~~This line has trailing spaces.~~
~~This line has both leading and trailing spaces.~~
~~Leading tab.~~
~~Trailing tab.~~
~~Leading and trailing tab.~~
~~Leading mixed whitespace.~~
~~Trailing mixed whitespace.~~
~~Leading and trailing mixed whitespace.~~
$

To trim only leading or only trailing spaces, use a simple pattern match: 


# Leading spaces only
$ while read; do echo "~~${REPLY## }~~"; done < whitespace
~~This line has leading spaces.~~
~~This line has trailing spaces. ~~
~~This line has both leading and trailing spaces. ~~
~~ → Leading tab.~~
~~Trailing tab. → ~~
~~ → Leading and trailing tab. → ~~
~~ → Leading mixed whitespace.~~
~~Trailing mixed whitespace. → ~~
~~ → Leading and trailing mixed whitespace. → ~~


# Trailing spaces only
$ while read; do echo "~~${REPLY%% }~~"; done < whitespace
~~ This line has leading spaces.~~
~~This line has trailing spaces.~~
~~ This line has both leading and trailing spaces.~~
~~ → Leading tab.~~
~~Trailing tab. → ~~
~~ → Leading and trailing tab. → ~~
~~ → Leading mixed whitespace.~~
~~Trailing mixed whitespace. → ~~
~~ → Leading and trailing mixed whitespace. → ~~


Trimming only leading or only trailing whitespace (including tabs) is a bit 
more complicated: 


# You need this either way
$ shopt -s extglob


# Leading whitespace only
$ while read; do echo "~~${REPLY##+([[:space:]])}~~"; done < whitespace
~~This line has leading spaces.~~
~~This line has trailing spaces. ~~
~~This line has both leading and trailing spaces. ~~
~~Leading tab.~~
~~Trailing tab. → ~~
~~Leading and trailing tab. → ~~
~~Leading mixed whitespace.~~
~~Trailing mixed whitespace. → ~~
~~Leading and trailing mixed whitespace. → ~~
$


# Trailing whitespace only
$ while read; do echo "~~${REPLY%%+([[:space:]])}~~"; done < whitespace
~~ This line has leading spaces.~~
~~This line has trailing spaces.~~
~~ This line has both leading and trailing spaces.~~
~~ → Leading tab.~~
~~Trailing tab.~~
~~ → Leading and trailing tab.~~
~~ → Leading mixed whitespace.~~
~~Trailing mixed whitespace.~~
~~ → Leading and trailing mixed whitespace.~~


Discussion 


OK, at this point you are probably looking at these lines and wondering how 
we’re going to make this comprehensible. It turns out there’s a simple, if 
subtle, explanation. 


Here we go. The first example used the default $REPLY variable that read
uses when you do not supply your own variable name(s). Chet Ramey
(maintainer of bash) made a design decision to “[if] there are no variables,
save the text of the line read to the variable $REPLY,” unchanged:


while read; do echo ~~"$REPLY"~~; done < whitespace


But when we supply one or more variable names to read, it does parse the
input, using the values in $IFS (which are space, tab, and newline by default).
One step of that parsing process is to trim leading and trailing whitespace,
which is just what we want:


while read REPLY; do echo ~~"$REPLY"~~; done < whitespace


Trimming leading or trailing spaces (but not both) is easy using the ${##} or
${%%} operators (see Recipe 6.7):


while read; do echo "~~${REPLY## }~~"; done < whitespace
while read; do echo "~~${REPLY%% }~~"; done < whitespace


Covering tabs is a little harder. If we had only tabs, we could use the ${##}
or ${%%} operators and insert literal tabs using the Ctrl-V Ctrl-I key
sequence. But that’s risky since it’s probable there’s a mix of spaces and tabs,
and some text editors or unwary users may strip out the tabs. So, we turn on
extended globbing and use a character class to make our intent clear. The
[:space:] character class would work without extglob, but we need to say
“one or more occurrences” using +() or else it will trim only a single space or
tab, not a run of them or a mix of both on the same line. If you only care about
space or tab, you could use the [:blank:] character class instead, since [:space:]
includes other characters like the vertical tab (\v) and DOS CR (carriage
return, \r):


# This works, need extglob for +() part 
$ shopt -s extglob 


$ while read; do echo "~~${REPLY##+([[:space:]])}~~"; done < whitespace 


$ while read; do echo "~~${REPLY%%+([[:space:]])}~~"; done < whitespace 


# This doesn't 

$ while read; do echo "~~${REPLY##[[:space:]]}~~"; done < whitespace
~~This line has leading spaces.~~
~~This line has trailing spaces. ~~
~~This line has both leading and trailing spaces. ~~
~~Leading tab.~~
~~Trailing tab. → ~~
~~Leading and trailing tab. → ~~
~~ → Leading mixed whitespace.~~
~~Trailing mixed whitespace. → ~~
~~ → Leading and trailing mixed whitespace. → ~~


Here’s a different take, exploiting the same $IFS parsing, but to parse out 
fields (or words) instead of records (or lines): 


$ for i in $(cat white_space); do echo ~~$i~~; done
~~This~~
~~line~~
~~has~~
~~leading~~
~~white~~
~~space.~~
~~This~~
~~line~~
~~has~~
~~trailing~~
~~white~~
~~space.~~
~~This~~
~~line~~
~~has~~
~~both~~
~~leading~~
~~and~~
~~trailing~~
~~white~~
~~space.~~
$


Finally, although the original solutions rely on Chet’s design decision about
read and $REPLY, this solution does not:


shopt -s extglob


while IFS= read -r line; do
    echo "None: ~~$line~~"                      # preserve all whitespace
    echo "Ld:   ~~${line##+([[:space:]])}~~"    # trim leading whitespace
    echo "Tr:   ~~${line%%+([[:space:]])}~~"    # trim trailing whitespace
    line="${line##+([[:space:]])}"              # trim leading and...
    line="${line%%+([[:space:]])}"              # ...trailing whitespace
    echo "All:  ~~$line~~"                      # show all trimmed
done < whitespace 
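

For scripts that cannot rely on extglob, a different technique (not the one used in this recipe) is to nest two ordinary parameter expansions. The sketch below assumes a hypothetical function name, trim, and uses only pattern features that also exist in plain POSIX shells.

```shell
# Hypothetical helper: print $1 with leading and trailing whitespace removed.
# Uses nested parameter expansion instead of extglob.
trim() (
    s=$1
    # ${s%%[![:space:]]*} is the leading run of whitespace; strip it
    s=${s#"${s%%[![:space:]]*}"}
    # ${s##*[![:space:]]} is the trailing run of whitespace; strip it
    s=${s%"${s##*[![:space:]]}"}
    printf '%s\n' "$s"
)
```

The inner expansion isolates the whitespace run and the outer one removes exactly that string, so interior whitespace is untouched. You might call it as line=$(trim "$line") inside the same while IFS= read -r loop.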


See Also


■ Recipe 6.7, “Testing with Pattern Matches”
■ Recipe 13.6, “Parsing Text with a read Statement”


13.16 Compressing Whitespace


Problem 


You have runs of whitespace in a file (perhaps it is fixed-length, space- 
padded) and you need to compress the spaces down to a single character or 
delimiter. 


Solution 


Use tr or awk as appropriate. 


Discussion 


If you are trying to compress runs of whitespace down to a single character,
you can use tr, but be aware that you may damage the file if it is not well
formed. For example, if fields are delimited by multiple whitespace
characters but internally have spaces, compressing multiple spaces down to
one space will remove that distinction. Imagine if the _ characters in the
following example were spaces instead. Note the → denotes a literal tab
character in the output:


$ cat data_file
Header1          Header2          Header3
Rec1_Field1      Rec1_Field2      Rec1_Field3
Rec2_Field1      Rec2_Field2      Rec2_Field3
Rec3_Field1      Rec3_Field2      Rec3_Field3


$ cat data_file | tr -s ' ' '\t'
Header1 → Header2 → Header3
Rec1_Field1 → Rec1_Field2 → Rec1_Field3
Rec2_Field1 → Rec2_Field2 → Rec2_Field3
Rec3_Field1 → Rec3_Field2 → Rec3_Field3


If your field delimiter is more than a single character, tr won’t work since it
translates single characters from its first set into the matching single
character in the second set. You can use awk to combine or convert field
separators. awk’s internal field separator FS accepts regular expressions, so
you can separate on pretty much anything. There is a handy trick to this as
well: an assignment to any field causes awk to reassemble the record using
the output field separator, OFS, so assigning field 1 to itself and then printing
the record has the effect of translating FS to OFS without you having to worry
about how many fields there are in each record.


In this example, multiple spaces delimit fields, but fields also have internal
spaces, so the simpler case of:


awk 'BEGIN {OFS="\t"} {$1=$1; print }' data_file1


won’t work. Here is a datafile:


$ cat data_file1
Header1          Header2          Header3
Rec1 Field1      Rec1 Field2      Rec1 Field3
Rec2 Field1      Rec2 Field2      Rec2 Field3
Rec3 Field1      Rec3 Field2      Rec3 Field
$


In the next example, we assign two spaces to FS and the tab to OFS. We then
make an assignment ($1 = $1) so awk rebuilds the record, but that results in
strings of tabs replacing the double spaces, so we use gsub to squash the tabs,
then we print. Note the → denotes a literal tab character in the output. The
output is a little hard to read, so there is a hex dump as well. Recall that
ASCII tab is 09 while ASCII space is 20:


$ awk 'BEGIN { FS = "  "; OFS = "\t" } { $1 = $1; gsub(/\t+ ?/, "\t"); print }' \
data_file1
Header1 → Header2 → Header3
Rec1 Field1 → Rec1 Field2 → Rec1 Field3
Rec2 Field1 → Rec2 Field2 → Rec2 Field3
Rec3 Field1 → Rec3 Field2 → Rec3 Field


$ awk 'BEGIN { FS = "  "; OFS = "\t" } { $1 = $1; gsub(/\t+ ?/, "\t"); print }' \
data_file1 | hexdump -C
00000000  48 65 61 64 65 72 31 09  48 65 61 64 65 72 32 09  |Header1.Header2.|
00000010  48 65 61 64 65 72 33 0a  52 65 63 31 20 46 69 65  |Header3.Rec1 Fie|
00000020  6c 64 31 09 52 65 63 31  20 46 69 65 6c 64 32 09  |ld1.Rec1 Field2.|
00000030  52 65 63 31 20 46 69 65  6c 64 33 0a 52 65 63 32  |Rec1 Field3.Rec2|
00000040  20 46 69 65 6c 64 31 09  52 65 63 32 20 46 69 65  | Field1.Rec2 Fie|
00000050  6c 64 32 09 52 65 63 32  20 46 69 65 6c 64 33 0a  |ld2.Rec2 Field3.|
00000060  52 65 63 33 20 46 69 65  6c 64 31 09 52 65 63 33  |Rec3 Field1.Rec3|
00000070  20 46 69 65 6c 64 32 09  52 65 63 33 20 46 69 65  | Field2.Rec3 Fie|
00000080  6c 64 0a                                          |ld.|
00000083


You can use awk to trim leading and trailing whitespace in the same way, but 
as noted previously, this will replace your field separators unless they are 
already spaces: 


awk '{ $1 = $1; print }' white_space 
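

Because FS accepts a regular expression, another way to handle the data_file1 case is to let the separator itself match a whole run of spaces, which avoids the gsub cleanup step. This is a sketch, not the recipe's method; the function name spaces_to_tabs is hypothetical, and it assumes no line starts with padding (a leading run of spaces would produce an empty first field).

```shell
# Hypothetical helper: convert fields separated by runs of two or more
# spaces (read from stdin) into tab-separated fields, leaving single
# internal spaces alone.
spaces_to_tabs() {
    awk 'BEGIN { FS = "  +"; OFS = "\t" } { $1 = $1; print }'
}
```

Here FS is a space followed by one or more spaces, so a single space inside a field never splits it, and the $1 = $1 assignment still triggers the rebuild with OFS.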


See Also


■ Effective awk Programming, 4th Edition, by Arnold Robbins (O’Reilly)
■ sed & awk, 2nd Edition, by Arnold Robbins and Dale Dougherty
(O’Reilly)
■ Recipe 13.17, “Processing Fixed-Length Records”
■ “tr Escape Sequences” in Appendix A
■ “Table of ASCII Values” in Appendix A


13.17 Processing Fixed-Length Records


Problem 


You need to read and process data that is in a fixed-length (also called fixed- 
width) form. 


Solution 


Use Perl or gawk 2.13 or greater. Given a file like: 


$ cat fixed-length_file
Header1-----------Header2-------------------------Header3---------
Rec1 Field1       Rec1 Field2                     Rec1 Field3     
Rec2 Field1       Rec2 Field2                     Rec2 Field3     
Rec3 Field1       Rec3 Field2                     Rec3 Field3     


you can process it using GNU’s gawk, by setting FIELDWIDTHS to the correct
field lengths, setting OFS as desired, and making an assignment so gawk
rebuilds the record using this OFS. However, gawk does not remove the
spaces used in padding the original record, so we use two gsubs to do that,
one for all the internal fields and the other for the last field in each record.
Finally, we just print. Note the → denotes a literal tab character in the output.
The output is a little hard to read, so there is a hex dump as well. Recall that
ASCII tab is 09 while ASCII space is 20:


$ gawk 'BEGIN { FIELDWIDTHS = "18 32 16"; OFS = "\t" }
> { $1 = $1; gsub(/ +\t/, "\t"); gsub(/ +$/, ""); print }' fixed-length_file
Header1----------- → Header2------------------------- → Header3---------
Rec1 Field1 → Rec1 Field2 → Rec1 Field3
Rec2 Field1 → Rec2 Field2 → Rec2 Field3
Rec3 Field1 → Rec3 Field2 → Rec3 Field3


$ gawk 'BEGIN { FIELDWIDTHS = "18 32 16"; OFS = "\t" }
> { $1 = $1; gsub(/ +\t/, "\t"); gsub(/ +$/, ""); print }' fixed-length_file \
> | hexdump -C
00000000  48 65 61 64 65 72 31 2d  2d 2d 2d 2d 2d 2d 2d 2d  |Header1---------|
00000010  2d 2d 09 48 65 61 64 65  72 32 2d 2d 2d 2d 2d 2d  |--.Header2------|
00000020  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |----------------|
00000030  2d 2d 2d 09 48 65 61 64  65 72 33 2d 2d 2d 2d 2d  |---.Header3-----|
00000040  2d 2d 2d 2d 0a 52 65 63  31 20 46 69 65 6c 64 31  |----.Rec1 Field1|
00000050  09 52 65 63 31 20 46 69  65 6c 64 32 09 52 65 63  |.Rec1 Field2.Rec|
00000060  31 20 46 69 65 6c 64 33  0a 52 65 63 32 20 46 69  |1 Field3.Rec2 Fi|
00000070  65 6c 64 31 09 52 65 63  32 20 46 69 65 6c 64 32  |eld1.Rec2 Field2|
00000080  09 52 65 63 32 20 46 69  65 6c 64 33 0a 52 65 63  |.Rec2 Field3.Rec|
00000090  33 20 46 69 65 6c 64 31  09 52 65 63 33 20 46 69  |3 Field1.Rec3 Fi|
000000a0  65 6c 64 32 09 52 65 63  33 20 46 69 65 6c 64 33  |eld2.Rec3 Field3|
000000b0  0a                                                |.|
000000b1

If you don’t have gawk, you can use Perl, which is more straightforward
anyway. We use a nonprinting while input loop (-n), unpack each record
($_) as it’s read, and turn the resulting list back into a scalar by joining the
elements with a tab. We then print each record, adding a newline at the end:


$ perl -ne 'print join("\t", unpack("A18 A32 A16", $_) ) . "\n";' \
> fixed-length_file
Header1----------- → Header2------------------------- → Header3---------
Rec1 Field1 → Rec1 Field2 → Rec1 Field3
Rec2 Field1 → Rec2 Field2 → Rec2 Field3
Rec3 Field1 → Rec3 Field2 → Rec3 Field3


$ perl -ne 'print join("\t", unpack("A18 A32 A16", $_) ) . "\n";' \
> fixed-length_file |
> hexdump -C
00000000  48 65 61 64 65 72 31 2d  2d 2d 2d 2d 2d 2d 2d 2d  |Header1---------|
00000010  2d 2d 09 48 65 61 64 65  72 32 2d 2d 2d 2d 2d 2d  |--.Header2------|
00000020  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |----------------|
00000030  2d 2d 2d 09 48 65 61 64  65 72 33 2d 2d 2d 2d 2d  |---.Header3-----|
00000040  2d 2d 2d 2d 0a 52 65 63  31 20 46 69 65 6c 64 31  |----.Rec1 Field1|
00000050  09 52 65 63 31 20 46 69  65 6c 64 32 09 52 65 63  |.Rec1 Field2.Rec|
00000060  31 20 46 69 65 6c 64 33  0a 52 65 63 32 20 46 69  |1 Field3.Rec2 Fi|
00000070  65 6c 64 31 09 52 65 63  32 20 46 69 65 6c 64 32  |eld1.Rec2 Field2|
00000080  09 52 65 63 32 20 46 69  65 6c 64 33 0a 52 65 63  |.Rec2 Field3.Rec|
00000090  33 20 46 69 65 6c 64 31  09 52 65 63 33 20 46 69  |3 Field1.Rec3 Fi|
000000a0  65 6c 64 32 09 52 65 63  33 20 46 69 65 6c 64 33  |eld2.Rec3 Field3|
000000b0  0a                                                |.|
000000b1


See the Perl documentation for the pack and unpack template formats. 


Discussion 


Anyone with any Unix background will automatically use some kind of 
delimiter in output, since the textutils toolchain is never far from mind, so 
fixed-length (a.k.a. fixed-width) records are rare in the Unix world. They are 
very common in the mainframe world, however, so they will occasionally
crop up in large applications that originated on big iron, such as some 
applications from SAP. As we’ve just seen, it’s no problem to handle them. 


One caveat to this recipe is that it requires each record to end in a newline. 
Many old mainframe record formats don’t, in which case you can use Recipe 
13.18 to add newlines to the end of each record before processing. 
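

If you have neither gawk nor Perl, the same split can be sketched with substr, which any POSIX awk provides. This is an alternative under assumptions, not the recipe's solution: the function name fixed_to_tabs is hypothetical, and the widths 18, 32, and 16 are hardcoded to match the sample file.

```shell
# Hypothetical helper: split fixed-width lines on stdin into three
# tab-separated fields of widths 18, 32, and 16, removing space padding.
fixed_to_tabs() {
    awk '{
        f1 = substr($0, 1, 18)      # columns 1 to 18
        f2 = substr($0, 19, 32)     # columns 19 to 50
        f3 = substr($0, 51, 16)     # columns 51 to 66
        # Strip the trailing spaces used as padding in each field
        sub(/ +$/, "", f1); sub(/ +$/, "", f2); sub(/ +$/, "", f3)
        print f1 "\t" f2 "\t" f3
    }'
}
```

Trimming each field separately means no second pass over the rebuilt record is needed, at the cost of spelling out every column position by hand.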


See Also 


■ man gawk
■ http://www.faqs.org/faqs/computer-lang/awk/faq/
■ http://perldoc.perl.org/functions/unpack.html
■ http://perldoc.perl.org/functions/pack.html
■ Recipe 13.15, “Trimming Whitespace”
■ Recipe 13.18, “Processing Files with No Line Breaks”


13.18 Processing Files with No Line Breaks


Problem 


You have a large file with no line breaks, and you need to process it. 


Solution 


Preprocess the file and add line breaks in appropriate places. For example,
OpenOffice’s OpenDocument Format (ODF) files are basically zipped
XML files. It is possible to unzip them and grep the XML, which we did a lot 
while writing this book. See Recipe 12.5 for a more comprehensive treatment 
of ODF files. In this example, we insert a newline after every closing angle 
bracket (>). That makes it much easier to process the file using grep or other 
textutils. Note that we must enter a backslash followed immediately by the 
Enter key to embed an escaped newline in the sed script: 


$ wc -l content.xml
1 content.xml 


$ sed -e 's/>/>\ 
> /g' content.xml | wc -l 
1687 


If you have fixed-length records with no newlines, do this instead, where 48 
is the length of the record: 


$ cat fixed-length 

Line_1_ _aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazZzzZzLine_2_ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazZzZzLine 3_ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzzZzZLine_4 _ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzzZzLine_5_ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzzzZzLine_6_ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzzLine 7_ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazZzZLine_ 8 __ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzzLine_9_ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazZzzZLine_10_ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazZzzLine_11_ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzzLine_12_ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazZzz 


$ wc -l fixed-length 
1 fixed-length 


$ sed 's/.\{48\}/&\ 

> /g;' fixed-length 

Line_1__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_2__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_3__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_4__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_5__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_6__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_7__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_8__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_9__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_10_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_11_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_12_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 


436 


$ perl -pe 's/(.{48})/$1\n/g;' fixed-length 

Line_1__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_2__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_3__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_4__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_5__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_6__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_7__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_8__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_9__aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_10_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_11_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 
Line_12_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZzZz 


Discussion 


This happens often when people create output programmatically, especially 
using canned modules and especially with HTML or XML output. 


Note the sed substitutions have an odd construct that allows an embedded 
newline. In sed, a literal ampersand (&) on the righthand side (RHS) of a 
substitution is replaced by the entire expression matched on the lefthand side 
(LHS), and the trailing \ on the first line escapes the newline so you don’t get 
an error like “sed: -e expression #1, char 11: unterminated ‘s’”. This is 
because sed doesn’t recognize \n as a metacharacter on the RHS of s///. 
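

For the fixed-length case, the standard fold utility offers a way around the embedded-newline construct altogether. This is a swapped-in technique rather than the recipe's sed or Perl approach; the function name split_records is hypothetical, and -b is used so that the width is counted in bytes.

```shell
# Hypothetical helper: break stdin into lines of at most $1 bytes each.
# fold -b counts bytes rather than display columns.
split_records() {
    fold -b -w "$1"
}
```

For the sample file you might run split_records 48 < fixed-length. Note that fold only works for fixed widths; inserting a break after a particular character, as in the XML example, still calls for sed or Perl.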


See Also 


■ http://sed.sourceforge.net/sedfaq.html
■ Effective awk Programming, 4th Edition, by Arnold Robbins (O’Reilly)
■ sed & awk, 2nd Edition, by Arnold Robbins and Dale Dougherty
(O’Reilly)
■ Recipe 12.5, “Comparing Two Documents”
■ Recipe 13.17, “Processing Fixed-Length Records”


13.19 Converting a Datafile to CSV


Problem 


You have a data file that you need to convert to a Comma Separated Values 
(CSV) file. 


Solution 


Use awk to convert the data into CSV format: 


$ awk 'BEGIN { FS="\t"; OFS="\",\"" } { gsub(/"/, "\"\""); $1 = $1;
> printf "\"%s\"\n", $0}' tab_delimited
"Line 1","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 2","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 3","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 4","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
$


You can do the same thing in Perl also:


$ perl -naF'\t' -e 'chomp @F; s/"/""/g for @F; print q(").join(q(","),@F)
> .qq("\n);' tab_delimited
"Line 1","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 2","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 3","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 4","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
$


Discussion


First of all, it’s tricky to define exactly what CSV really means. There is no 
formal specification, and various vendors have implemented various 
versions. Our version here is very simple, and should hopefully work just 
about anywhere. We place double quotes around all fields (some
implementations only quote strings, or strings with internal commas), and we
double internal double quotes.


To do that, we have awk split up the input fields using a tab as the field
separator and set the output field separator (OFS) to ",", which will provide
the trailing quote for each field and then the leading quote for the next field
as well as the comma in between them. We then globally replace any double
quotes with two double quotes, make an assignment so awk rebuilds the
record with our specified OFS (see the awk trick in Recipe 13.15), and print
out the record with leading and trailing double quotes. We have to escape 
double quotes in several places, which looks a little cluttered, but otherwise 
this is very straightforward. 
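

When the values are already in shell variables rather than in a tab-delimited file, the same two rules (quote every field, double any internal double quotes) can be sketched directly in the shell. The function name csv_record is hypothetical, and the sketch assumes fields do not contain newlines.

```shell
# Hypothetical helper: print its arguments as one CSV record, quoting every
# field and doubling any embedded double quotes.
csv_record() (
    sep=''
    for field in "$@"; do
        # Double each internal double quote
        quoted=$(printf '%s' "$field" | sed 's/"/""/g')
        # Emit the separator (empty before the first field), then the field
        printf '%s"%s"' "$sep" "$quoted"
        sep=','
    done
    printf '\n'
)
```

Calling csv_record "$name" "$comment" produces one line in the same style as the awk output above, so the two approaches can be mixed in one output file.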


See Also 

■ http://www.faqs.org/faqs/computer-lang/awk/faq/
■ Recipe 13.15, “Trimming Whitespace”
■ Recipe 13.20, “Parsing a CSV Datafile”


13.20 Parsing a CSV Datafile 


Problem 


You have a Comma Separated Values datafile that you need to parse. 


Solution 


Unlike the previous recipe for converting to CSV, there is no easy way to do
this, since it’s tricky to define exactly what CSV really means.
Possible solutions for you to explore are:

■ sed: http://sed.sourceforge.net/sedfaq4.html#s4.12.
■ awk: http://lorance.freeshell.org/csv/.
■ Perl: Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl
(O’Reilly) has a regex to do this; see also CPAN, the Comprehensive Perl
Archive Network, for various modules.
■ Load the CSV file into a spreadsheet (LibreOffice’s Calc and Microsoft’s
Excel both work), then copy and paste the contents into a text editor; you
should get tab-delimited output that you can now use easily.


Discussion 


As noted in Recipe 13.19, there is no formal specification for CSV, and that 
fact, combined with data variations, makes this task much harder than it 
sounds. 


See Also 
= Recipe 13.19, “Converting a Datafile to CSV” 


Chapter 14. Writing Secure Shell Scripts


Writing secure shell scripts?! How can shell scripts be secure when you can 
read the source code? 


Any system that depends on concealing implementation details is attempting 
to use security by obscurity, and that is no security at all. Just ask the major 
software manufacturers whose source code is a closely guarded trade secret, 
yet whose products are incessantly vulnerable to exploits written by people 
who have never seen that source code. Contrast that with the code from 
OpenSSH and OpenBSD, which is totally open, yet very secure. 


Security by obscurity will never work for long, though some forms of it can 
be a useful additional layer of security. For example, having daemons 
assigned to listen on nonstandard port numbers will keep a lot of the so-called 
script-kiddies away. But security by obscurity must never be the only layer of 
security because sooner or later, someone is going to discover whatever 
you’ve hidden.


As Bruce Schneier says, security is a process. It’s not a product, object, or 
technique, and it is never finished. As technology, networks, attacks and 
defenses evolve, so must your security process. So what does it mean to write 
secure shell scripts? 


Secure shell scripts will reliably do what they are supposed to do, and only 
what they are supposed to do. They won’t lend themselves to being exploited 
to gain root access, they won’t accidentally rm -rf /, and they won’t leak 
information, such as passwords. They will be robust, but will fail gracefully. 
They will tolerate inadvertent user mistakes and sanitize all user input. They 
will be as simple as possible, and contain only clear, readable code and 
documentation so that the intention of each line is unambiguous. 


That sounds a lot like any well-designed, robust program, doesn’t it? Security
should be part of any good design process from the start; it shouldn’t be
tacked on at the end. In this chapter we highlight the most common
security weaknesses and questions, and show you how to tackle them.


A lot has been written about security over the years. If you’re interested,
Practical UNIX & Internet Security, 3rd Edition, by Gene Spafford et al.
(O’Reilly) is a good place to start. Chapter 15 of Classic Shell Scripting by
Nelson H. F. Beebe and Arnold Robbins (O’Reilly) is another excellent
resource. There are also many good online references, such as “A Lab
engineer’s check list for writing secure Unix code”.


The listing in Example 14-1 collects the most universal of the secure shell 
programming techniques, so they are all in one place as a quick reference 
when you need them or to copy into a script template. Be sure to read the full 
recipe for each technique so you understand it. 


Example 14-1. ch14/security_template 


#!/usr/bin/env bash
# cookbook filename: security_template

# Set a sane/secure path
PATH='/usr/local/bin:/bin:/usr/bin'
# It's almost certainly already marked for export, but make sure
\export PATH

# Clear all aliases. Important: leading \ inhibits alias expansion.
\unalias -a

# Clear the command path hash
hash -r

# Set the hard limit to 0 to turn off core dumps
ulimit -H -c 0 --

# Set a sane/secure IFS (note this is bash & ksh93 syntax only--not portable!)
IFS=$' \t\n'

# Set a sane/secure umask variable and use it
# Note this does not affect files already redirected on the command line
# 022 results in 0755 perms, 077 results in 0700 perms, etc.
UMASK=022
umask $UMASK

until [ -n "$temp_dir" -a ! -d "$temp_dir" ]; do
    temp_dir="/tmp/meaningful_prefix.${RANDOM}${RANDOM}${RANDOM}"
done
mkdir -p -m 0700 "$temp_dir" \
  || { echo "FATAL: Failed to create temp dir '$temp_dir': $?"; exit 100; }

# Do our best to clean up temp files no matter what
# Note $temp_dir must be set before this, and must not change!
cleanup="rm -rf $temp_dir"
trap "$cleanup" ABRT EXIT HUP INT QUIT


14.1 Avoiding Common Security Problems


Problem 


You want to avoid common security problems in your scripting. 


Solution 


Validate all external input, including interactive input and input read from
configuration files. In particular, never eval input that you have not
checked very thoroughly.


Use secure temporary files, ideally in secure temporary directories. 


Make sure you are using trusted external executables. 


Discussion 


In a way, this recipe barely scratches the surface of scripting and system 
security. Yet it also covers the most common security problems you’ll find.


Data validation, or rather the lack of it, is a huge deal in computer security 
right now. This is the problem that leads to buffer overflow and data injection 
attacks, which are by far the most common classes of exploit going around. 
bash doesn’t suffer from this issue in the same way that C does, but the
concepts are the same. In the bash world it’s more likely that unvalidated 
input will contain something like ;rm -rf / than a buffer overflow; 
however, neither is welcome. Validate your data! 
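
As a minimal sketch of what such validation can look like, the function below accepts a value only when every character appears on a short whitelist. The function name and the allowed character set are assumptions made for illustration; adjust both to whatever your script really expects.

```bash
# is_safe_name: succeed only if $1 is non-empty and contains nothing but
# letters, digits, dot, underscore, and hyphen (a hypothetical whitelist)
is_safe_name() {
    case $1 in
        ''                ) return 1 ;;  # empty input is rejected
        *[!A-Za-z0-9._-]* ) return 1 ;;  # a character outside the whitelist
        *                 ) return 0 ;;
    esac
}

for candidate in 'report.txt' 'x;rm -rf /'; do
    if is_safe_name "$candidate"; then
        echo "accepted: $candidate"
    else
        echo "rejected: $candidate"
    fi
done
```

Listing what is allowed tends to be safer than trying to list everything that is dangerous, since it is easy to forget a troublesome character.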


Race conditions are another big issue, closely tied to the problem of an 
attacker gaining an ability to write over unexpected files. A race condition 
exists when two or more separate events must occur in the correct order at the 
correct time without external interference. They often result in providing an 
unprivileged user with read and/or write access to files that user shouldn’t be 
able to access, which in turn can result in so-called privilege escalation, 
where an ordinary user can gain root access. Insecure use of temporary files 
is a very common factor in this kind of attack. Using secure temporary files, 
especially inside secure temporary directories, will eliminate this attack 
vector. 


Another common attack vector is Trojaned utilities. Like the Trojan horse, 
these appear to be one thing while they are in fact something else. The 
canonical example here is the Trojaned ls command that works just like the
real ls command except when run by root. In that case it creates a new user
called r00t, with a default password known to the attacker, and deletes itself. 
Using a secure $PATH is about the best you can do from the scripting side. 
From the systems side there are many tools, such as FCheck, Tripwire, and 
AIDE, to help you assure system integrity. 


See Also 


• https://www.tripwire.com/
• http://aide.sourceforge.net/


14.2 Avoiding Interpreter Spoofing 


Problem 


You want to avoid certain kinds of setuid root spoofing attacks. 


Solution


Pass a single trailing dash to the shell, as in: 


#!/bin/bash - 


Discussion 


The first line of a script is a magic line (often called the shebang line) that 
tells the kernel what interpreter to use to process the rest of the file. The 
kernel will also look for a single option to the specified interpreter. There are 
some attacks that take advantage of this fact, but if you pass an argument 
along, they are avoided. See section 4.7 of the Unix FAQs for details. 


However, hardcoding the path to bash may present a portability issue. See 
Recipe 15.1 for details. 


See Also 
• Recipe 14.15, “Writing setuid or setgid Scripts”
• Recipe 15.1, “Finding bash Portably for #!”


14.3 Setting a Secure $PATH 


Problem 


You want to make sure you are using a secure path. 


Solution 


Set $PATH to a known good state at the beginning of every script: 


# Set a sane/secure path
PATH='/usr/local/bin:/bin:/usr/bin'

# It's almost certainly already marked for export, but make sure
export PATH


Or use the getconf utility to get a path guaranteed by POSIX to find all of the
standard utilities: 


export PATH=$(getconf PATH) 


Discussion 


There are two portability problems with the second example. First, $() is less 
portable (if more readable) than ``. Second, having the export command on 
the same line as the variable assignment won’t always work across every old 
or odd version of Unix and Linux. var='foo'; export var is more portable 
than export var='foo'. Also note that the export command need only be 
used once to flag a variable to be exported to child processes. 


If you don’t use getconf, our example is a good default path for starters, 
though you may need to adjust it for your particular environment or needs. 
You might also use the less portable version: 


export PATH='/usr/local/bin:/bin:/usr/bin'


Depending on your security risk and needs, you should also consider using 
absolute paths. This tends to be cumbersome and can be an issue where 
portability is concerned, as different operating systems put tools in different 
places. One way to mitigate these issues to some extent is to use variables. If 
you do this, sort them so you don’t end up with the same command three 
times because you missed it scanning the unsorted list. See Example 14-2 for 
an example. 


Example 14-2. ch14/finding_tools 


#!/usr/bin/env bash
# cookbook filename: finding_tools

# Export may or may not also be needed, depending on what you are doing

# These are fairly safe bets
_cp='/bin/cp'
_mv='/bin/mv'
_rm='/bin/rm'

# These are a little trickier
case $(/bin/uname) in
    'Linux' )
        _cut='/bin/cut'
        _nice='/bin/nice'
        # [...]
    ;;
    'SunOS' )
        _cut='/usr/bin/cut'
        _nice='/usr/bin/nice'
        # [...]
    ;;
    # [...]
esac


One other advantage of this method is that it makes it very easy to see exactly 
what tools your script depends on, and you can even add a simple function to 
make sure that each tool is available and executable before your script really 
gets going. 
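
One possible shape for such a check is sketched here. The function name check_tools is hypothetical, and the two variables simply repeat the style of Example 14-2; treat this as an illustration rather than part of the cookbook's own scripts.

```bash
# Hypothetical tool variables in the style of Example 14-2
_cp='/bin/cp'
_rm='/bin/rm'

# check_tools: report every argument that is not an executable file and
# return nonzero if at least one is missing
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if [ ! -x "$tool" ]; then
            echo "FATAL: required tool '$tool' is missing or not executable" >&2
            missing=1
        fi
    done
    return $missing
}

if check_tools "$_cp" "$_rm"; then
    echo "all required tools found"
fi
```

In a real script you would probably call it once near the top, as in check_tools "$_cp" "$_mv" "$_rm" || exit 100, so the script stops before doing any work with an incomplete toolset.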


WARNING 


Be careful about the variable names you use. Some programs, like InfoZip, use 
environment variables such as $ZIP and $UNZIP to pass settings to the program
itself, so if you do something like ZIP='/usr/bin/zip', you can spend days 
pulling your hair out wondering why it works fine from the command line, but 
not in your script. Trust us. We learned this one the hard way. Also RTFM. 


See Also 
• Recipe 6.14, “Branching Many Ways”
• Recipe 6.15, “Parsing Command-Line Arguments”
• Recipe 14.9, “Finding World-Writable Directories in Your $PATH”
• Recipe 14.10, “Adding the Current Directory to the $PATH”
• Recipe 15.2, “Setting a POSIX $PATH”
• Recipe 16.4, “Changing Your $PATH Permanently”
• Recipe 16.5, “Changing Your $PATH Temporarily”
• Recipe 19.3, “Forgetting That the Current Directory Is Not in the $PATH”
• “Builtin Commands” in Appendix A
• “bash Reserved Words” in Appendix A


14.4 Clearing All Aliases 


Problem 


You need to make sure that there are no malicious aliases in your 
environment for security reasons. 


Solution 


Use the \unalias -a command to unalias any existing aliases. 


Discussion 


If an attacker can trick root or even another user into running a command, 
they will be able to gain access to data or privileges they shouldn’t have. One 
way to trick another user into running a malicious program is to create an 
alias to some other common program (e.g., ls).


The leading \, which suppresses alias expansion, is very important because
without it you can be tricked, like this: 


$ alias unalias=echo
$ alias builtin=ls

$ builtin unalias vi
ls: unalias: No such file or directory
ls: vi: No such file or directory

$ unalias -a
-a


WARNING 
As Chet says, “This is a tricky problem”: 


Since the shell finds shell functions before builtins in command search, and 
allows functions to be exported in the environment, it might be worth 
stressing to use builtin before every builtin, use command to skip function 
lookup, or just unset every function you might be interested in: 


_OLD=$POSIXLY_CORRECT; POSIXLY_CORRECT=1
\unset -f builtin command unset
POSIXLY_CORRECT=$_OLD ; \unset _OLD
builtin unalias command builtin unset
unset -f $(command declare -F \
  | command sed 's/^declare -f //')


Or consider unsetting the expand_aliases option, in which case you have 
to do the unset/unalias dance for shopt as well. 


Chet Ramey 


See Also 


• Recipe 10.7, “Redefining Commands with alias”
• Recipe 10.8, “Avoiding Aliases and Functions”
• Recipe 16.8, “Shortening or Changing Command Names”


14.5 Clearing the Command Hash 


Problem 


You need to make sure that your command hash has not been subverted. 


Solution


Use the hash -r command to clear entries from the command hash. 


Discussion 


On execution, bash “remembers” the location of most commands found in 
the $PATH to speed up subsequent invocations. 


If an attacker can trick root or even another user into running a command, 
they will be able to gain access to data or privileges they shouldn’t have. One 
way to trick another user into running a malicious program is to poison the 
hash so that the wrong program may be run. 
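
To see what a poisoned entry looks like, the short session below plants one by hand with the bash builtin's -p option and then clears it. Using /bin/false as the stand-in for an attacker's program is our assumption for the demonstration; nothing here comes from a real attack.

```bash
# Simulate a poisoned entry: make bash remember a bogus location for "ls"
# (/bin/false stands in for an attacker's program)
hash -p /bin/false ls
hash -t ls      # prints the remembered path

# Forget every remembered location; the next lookup searches $PATH again
hash -r
hash -t ls 2>/dev/null || echo "ls is no longer in the hash table"
```

While the bogus entry is in place, typing ls would run /bin/false without consulting $PATH at all, which is why the template clears the table before doing anything else.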


See Also 

• Recipe 14.9, “Finding World-Writable Directories in Your $PATH”
• Recipe 14.10, “Adding the Current Directory to the $PATH”
• Recipe 15.2, “Setting a POSIX $PATH”
• Recipe 16.4, “Changing Your $PATH Permanently”
• Recipe 16.5, “Changing Your $PATH Temporarily”
• Recipe 19.3, “Forgetting That the Current Directory Is Not in the $PATH”


14.6 Preventing Core Dumps 


Problem 


You want to prevent your script from dumping core in the case of an 
unrecoverable error, since core dumps may contain sensitive data from 
memory such as passwords. 


Solution 


Use the bash builtin ulimit to set the core file size limit to 0, typically in your
.bashrc file:


ulimit -H -c 0 --


Discussion 


Core dumps are intended for debugging and contain an image of the memory 
used by the process at the time it failed. As such, the file will contain 
anything the process had stored in memory (e.g., user-entered passwords). 


Set this in a system-level file such as /etc/profile or /etc/bashrc to which users 
have no write access if you don’t want them to be able to change it. 
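
If you want to confirm the setting took effect, you can read the limits back. The sketch below is an illustration under two assumptions: the helper name no_core_dumps is ours, and it uses ulimit -c 0 with neither -S nor -H, which in bash sets the soft and hard limits together. Some systems may refuse to lower only the hard limit while the soft limit is still above zero, so setting both at once is the more forgiving form.

```bash
# no_core_dumps: set both the soft and the hard core-size limit to 0
# (with neither -S nor -H, bash's ulimit sets both at once)
no_core_dumps() {
    ulimit -c 0
}

# Demonstrate in a subshell so the calling shell's limits are untouched
(
    no_core_dumps
    echo "soft: $(ulimit -S -c)  hard: $(ulimit -H -c)"
)
```

Once the hard limit is 0, an unprivileged process cannot raise it again, which is the point of setting it in a file users cannot edit.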


See Also 
• help ulimit


14.7 Setting a Secure $IFS 


Problem 


You want to make sure your internal field separator environment variable is 
clean. 


Solution 


Set it to a known good state at the beginning of every script using this clear 
(but not POSIX-compliant) syntax: 


# Set a sane/secure IFS (note this is bash & ksh93 syntax only--not portable!)
IFS=$' \t\n'


Discussion 


As noted, this syntax is not portable. However, the canonical portable syntax
is unreliable because it may easily be inadvertently stripped by editors that
trim whitespace. The values are traditionally space, tab, newline, and the
order is important. $*, which returns all positional parameters, the special
${!prefix@} and ${!prefix*} parameter expansions, and programmable
completion all use the first value of $IFS as their separator.
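
A quick way to convince yourself that the order matters is to change the first character and watch "$*" change with it. The function name join_args and the colon-first value are assumptions chosen only to make the effect visible.

```bash
# join_args: print all of its arguments as one string via "$*"
join_args() {
    echo "$*"
}

IFS=$' \t\n'                    # space first (the traditional order)
join_args alpha beta gamma      # joined with spaces

IFS=$':\t\n'                    # hypothetical value with a colon first
join_args alpha beta gamma      # joined with colons

IFS=$' \t\n'                    # restore the sane value
```

The same thing happens, less visibly, if a newline ends up first: anything that joins words with "$*" starts producing multi-line strings.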


The typical method for writing leaves a trailing space and tab (indicated here
by the dot and arrow) on the first line:


1 IFS='·→
2 '


Newline, space, tab is less likely to be trimmed, but that changes the default
order, which may result in unexpected results from some commands:


1 IFS='
2 ·→'


See Also 


• Recipe 13.15, “Trimming Whitespace”


14.8 Setting a Secure umask 


Problem 


You want to make sure you are using a secure umask. 


Solution 

Use the bash builtin umask to set a known good state at the beginning of
every script:


# Set a sane/secure umask variable and use it
# Note this does not affect files already redirected on the command line
# 002 results in 0775 perms, 077 results in 0700 perms, etc.
UMASK=002
umask $UMASK


Discussion 


We set the $UMASK variable in case we need to use different masks elsewhere
in the program. You could just as easily skip it and do the following—it’s not 
a big deal: 


umask 002 


TIP 


Remember that umask is a mask that specifies the bits to be taken away from 
the default permissions of 777 for directories and 666 for files. When in doubt, 
test it out: 


# Run a new shell so you don't affect your current
# environment
/tmp$ bash

# Check the current settings
/tmp$ touch um_current

# Check some other settings
/tmp$ umask 000 ; touch um_000
/tmp$ umask 022 ; touch um_022
/tmp$ umask 077 ; touch um_077

/tmp$ ls -l um_*
-rw-rw-rw- 1 jp jp 0 Jul 22 06:05 um_000
-rw-r--r-- 1 jp jp 0 Jul 22 06:05 um_022
-rw------- 1 jp jp 0 Jul 22 06:05 um_077
-rw-rw-r-- 1 jp jp 0 Jul 22 06:05 um_current

# Clean up and exit the subshell
/tmp$ rm um_*
/tmp$ exit
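
If you would rather work the numbers out than create test files, the mask arithmetic is a bitwise AND with the complement of the mask. The helper below is a sketch of that calculation only; the name perms_for is hypothetical, and it assumes the requesting program asks for the usual 666 (files) or 777 (directories).

```bash
# perms_for MASK: print the modes a new file and a new directory would get
# (hypothetical helper; MASK is an octal umask such as 022)
perms_for() {
    local mask=$(( 8#$1 ))      # 8# makes bash read the argument as octal
    printf 'umask %03o: files %04o, directories %04o\n' \
        "$mask" $(( 8#666 & ~mask )) $(( 8#777 & ~mask ))
}

perms_for 022
perms_for 077
```

Programs that request something other than 666 or 777 (mkdir -m, for example) will end up with different results, so the touch test above remains the final word.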


See Also


• help umask
• http://linuxzoo.net/page/sec_umask.html


14.9 Finding World-Writable Directories in Your $PATH


Problem 


You want to make sure that there are no world-writable directories in root’s 
$PATH. (To see why, read Recipe 14.10.)


Solution 


Use the simple script in Example 14-3 to check your $PATH. Use it in 
conjunction with su or sudo to check paths for other users. 


Example 14-3. ch14/chkpath.1


#!/usr/bin/env bash
# cookbook filename: chkpath.1
# Check your $PATH for world-writable or missing directories

exit_code=0

for dir in ${PATH//:/ }; do
    [ -L "$dir" ] && printf "%b" "symlink, "
    if [ ! -d "$dir" ]; then
        printf "%b" "missing\t\t"
        (( exit_code++ ))
    elif [ -n "$(ls -lLd $dir | grep '^d.......w. ')" ]; then
        printf "%b" "world writable\t"
        (( exit_code++ ))
    else
        printf "%b" "ok\t\t"
    fi
    printf "%b" "$dir\n"
done
exit $exit_code


For example: 


# ./chkpath
ok              /usr/local/sbin
ok              /usr/local/bin
ok              /sbin
ok              /bin
ok              /usr/sbin
ok              /usr/bin
ok              /usr/X11R6/bin
ok              /root/bin
missing         /does_not_exist
world writable  /tmp
symlink, world writable /tmp/bin
symlink, ok             /root/sbin
#


Discussion 


We convert the $PATH to a space-delimited list using the technique from
Recipe 9.11, test for symbolic links (-L), and make sure the directory actually
exists (-d). Then we get a long directory listing (-l), dereferencing symbolic
links (-L) and listing the directory name only (-d), not the directory’s
contents. Then we finally get to grep for world-writable directories.
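
If you would prefer not to parse ls output, the same test can be sketched with find. This is an alternative under our own assumptions, not the recipe's method: the function name is_world_writable is hypothetical, and the loop splits $PATH on colons with read -a so that directory names containing spaces stay intact.

```bash
# is_world_writable DIR: succeed if DIR (after following symlinks) is a
# directory that anyone may write to
is_world_writable() {
    [ -n "$(find -L "$1" -prune -type d -perm -0002 2>/dev/null)" ]
}

# Split $PATH on colons only, so names containing spaces survive
IFS=: read -r -a path_dirs <<< "$PATH"
for dir in "${path_dirs[@]}"; do
    # An empty entry means the current directory
    if is_world_writable "${dir:-.}"; then
        echo "world writable: ${dir:-.}"
    fi
done
```

Here -prune keeps find from descending into the directory, and -perm -0002 matches only when the write bit for "other" is set.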


As you can see, we spaced out the ok directories, while directories with a 
problem may get a little cluttered. We also broke the usual rule of Unix tools 
being quiet unless there’s a problem, because we felt it was a useful 
opportunity to see exactly what is in your path and give it a once-over in 
addition to the automated check. 


We also provide an exit code of zero on success with no problems detected in 
the $PATH, or the count of errors found. With a little more tweaking, as in 
Example 14-4, we can add the file’s mode, owner, and group into the output, 
which might be even more valuable to check. 


Example 14-4. ch14/chkpath.2


#!/usr/bin/env bash
# cookbook filename: chkpath.2
# Check your $PATH for world-writable or missing directories, with 'stat'

exit_code=0

for dir in ${PATH//:/ }; do
    [ -L "$dir" ] && printf "%b" "symlink, "
    if [ ! -d "$dir" ]; then
        printf "%b" "missing\t\t\t\t"
        (( exit_code++ ))
    else
        stat=$(ls -lHd $dir | awk '{print $1, $3, $4}')
        if [ -n "$(echo $stat | grep '^d.......w. ')" ]; then
            printf "%b" "world writable\t$stat "
            (( exit_code++ ))
        else
            printf "%b" "ok\t\t$stat "
        fi
    fi
    printf "%b" "$dir\n"
done
exit $exit_code


For example: 


# ./chkpath ; echo $?
ok              drwxr-xr-x root root /usr/local/sbin
ok              drwxr-xr-x root root /usr/local/bin
ok              drwxr-xr-x root root /sbin
ok              drwxr-xr-x root root /bin
ok              drwxr-xr-x root root /usr/sbin
ok              drwxr-xr-x root root /usr/bin
ok              drwxr-xr-x root root /usr/X11R6/bin
ok              drwx------ root root /root/bin
missing                         /does_not_exist
world writable  drwxrwxrwt root root /tmp
symlink, ok             drwxr-xr-x root root /root/sbin
2
#


See Also

• Recipe 9.11, “Finding a File Using a List of Possible Locations”
• Recipe 14.10, “Adding the Current Directory to the $PATH”
• Recipe 15.2, “Setting a POSIX $PATH”
• Recipe 16.4, “Changing Your $PATH Permanently”
• Recipe 16.5, “Changing Your $PATH Temporarily”
• Recipe 19.3, “Forgetting That the Current Directory Is Not in the $PATH”


14.10 Adding the Current Directory to the $PATH


Problem 


Having to type ./script (with the leading dot-slash) all the time is tedious,
and you’d rather just add . (or an empty directory, meaning a leading or
trailing : or a :: in the middle) to your $PATH.


Solution 


We advise against doing this for any user, but we strongly advise against 
doing it for root. If you absolutely must do this, make sure the . comes last. 
Never do it as root. 


Discussion 


As you know, the shell searches the directories listed in $PATH when you 
enter a command name without a path. The reason not to add . is the same 
reason not to allow world-writable directories in your $PATH (see Recipe 14.9 
for how to find these). 


Say you are in /tmp and have . as the first thing in your $PATH. If you type ls
and there happens to be a file called /tmp/ls, you will run that file instead of
the /bin/ls you meant to run. Now what? Well, it depends. It’s possible (even
likely, given the name) that /tmp/ls is a malicious script, and if you have just
run it as root there is no telling what it could do, up to and including deleting
itself when it’s finished to remove the evidence.


So what if you put it last? Well, have you ever typed mc instead of mv? We 
have. So unless Midnight Commander is installed on your system, you could 
accidentally run ./mc when you meant /bin/mv, with the same consequences 
as just described. 


Just say no to dot! 
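
If you want a script to notice when someone has slipped the current directory into the path anyway, a check along these lines may help. The function name path_has_cwd is an assumption for this sketch, and it only looks for a literal . or an empty entry; other relative entries (such as bin or ./bin) would need additional patterns.

```bash
# path_has_cwd VALUE: succeed if VALUE, read as a $PATH-style list, contains
# "." or an empty entry (both mean "the current directory")
path_has_cwd() {
    # Wrapping in colons turns a leading or trailing : into ::
    case ":$1:" in
        *:.:* | *::* ) return 0 ;;
        *            ) return 1 ;;
    esac
}

if path_has_cwd "$PATH"; then
    echo "WARNING: the current directory is in \$PATH" >&2
fi
```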


See Also 

• Section 2.13 of the Unix FAQs
• Recipe 9.11, “Finding a File Using a List of Possible Locations”
• Recipe 14.3, “Setting a Secure $PATH”
• Recipe 14.9, “Finding World-Writable Directories in Your $PATH”
• Recipe 15.2, “Setting a POSIX $PATH”
• Recipe 16.4, “Changing Your $PATH Permanently”
• Recipe 16.5, “Changing Your $PATH Temporarily”
• Recipe 19.3, “Forgetting That the Current Directory Is Not in the $PATH”


14.11 Using Secure Temporary Files 


Problem 


You need to create a temporary file or directory, but are aware of the security 
implications of using a predictable name. 


Solution 


Try using echo "~$TMPDIR~" to see if your system provides a secure
temporary directory. We’re using the ~s as brackets so you see something if
the variable is not set.


The easy, portable, and “usually good enough” solution is to just use
$RANDOM inline in your script. For example:


# Make sure $TMP is set to something
[ -n "$TMP" ] || TMP='/tmp'

# Make a "good enough" random temp directory
until [ -n "$temp_dir" -a ! -d "$temp_dir" ]; do
    temp_dir="$TMP/meaningful_prefix.${RANDOM}${RANDOM}${RANDOM}"
done
mkdir -p -m 0700 $temp_dir \
  || { echo "FATAL: Failed to create temp dir '$temp_dir': $?"; exit 100; }

# Make a "good enough" random temp file
until [ -n "$temp_file" -a ! -e "$temp_file" ]; do
    temp_file="$TMP/meaningful_prefix.${RANDOM}${RANDOM}${RANDOM}"
done
touch $temp_file && chmod 0600 $temp_file \
  || { echo "FATAL: Failed to create temp file '$temp_file': $?"; exit 101; }


Even better, use both a random temporary directory and a random filename, 
as in Example 14-5! 


Example 14-5. ch14/make_temp


# cookbook filename: make_temp

# Make sure $TMP is set to something
[ -n "$TMP" ] || TMP='/tmp'

# Make a "good enough" random temp directory
until [ -n "$temp_dir" -a ! -d "$temp_dir" ]; do
    temp_dir="$TMP/meaningful_prefix.${RANDOM}${RANDOM}${RANDOM}"
done
mkdir -p -m 0700 $temp_dir \
  || { echo "FATAL: Failed to create temp dir '$temp_dir': $?"; exit 100; }

# Make a "good enough" random temp file in the temp dir
temp_file="$temp_dir/meaningful_prefix.${RANDOM}${RANDOM}${RANDOM}"
touch $temp_file && chmod 0600 $temp_file \
  || { echo "FATAL: Failed to create temp file '$temp_file': $?"; exit 101; }


No matter how you do it, don’t forget to set a trap to clean up (Example 14-6).
As noted, $temp_dir must be set before this trap is declared, and its value
must not change. If those things aren’t true, rewrite the logic to account for
your needs.


Example 14-6. ch14/clean_temp 

# cookbook filename: clean_temp

# Do our best to clean up temp files no matter what
# Note $temp_dir must be set before this, and must not change!
cleanup="rm -rf $temp_dir"
trap "$cleanup" ABRT EXIT HUP INT QUIT
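
A variation you may come across puts the cleanup in a function instead of a string. The sketch below is one way that could look, not the cookbook's own listing: the function body is evaluated when the trap fires, so it uses whatever $temp_dir holds at that moment, and the quoting copes with spaces in the name. The ${temp_dir:?} guard and the use of mkdir without -p are our additions.

```bash
# Create a private work area; without -p, mkdir fails if the name exists
temp_dir="${TMPDIR:-/tmp}/meaningful_prefix.${RANDOM}${RANDOM}${RANDOM}"
mkdir -m 0700 "$temp_dir" || exit 100

# cleanup: remove the work area; the :? guard stops the rm with an error
# rather than expanding to an empty string if temp_dir has been unset
cleanup() {
    rm -rf -- "${temp_dir:?}"
}
trap cleanup ABRT EXIT HUP INT QUIT
```

The trade-off is the mirror image of the string form: if something later changes $temp_dir, the function removes the new value rather than the original one.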


WARNING 


$RANDOM is not available in dash, which is /bin/sh in some Linux distributions.
Notably, current versions of Debian and Ubuntu use dash because it is smaller 
and faster than bash and thus helps to boot faster. But that means that /bin/sh, 
which used to be a symlink to bash, is now a symlink to dash instead, and 
various bash-specific features will not work. 


Discussion 


$RANDOM has been available since at least bash 2.0, and using it is probably
good enough. Simple code is better and easier to secure than complicated
code, so using $RANDOM may make your code more secure than having to deal
with the validation and error-checking complexities of mktemp or
/dev/urandom. You may also tend to use it more because it is so simple.
However, $RANDOM provides only numbers, while mktemp provides numbers
and upper- and lowercase letters, and urandom provides numbers and
lowercase letters, thus vastly increasing the key space.


However you create it, using a temporary directory in which to work has the
following advantages:


• mkdir -p -m 0700 $temp_dir avoids the race condition inherent in
touch $temp_file && chmod 0600 $temp_file.

• Files created inside the directory are not even visible to a non-root
attacker outside the directory when 0700 permissions are set.

• A temporary directory makes it easy to ensure all of your temporary files
are removed at exit. If you have temp files scattered about, there’s always
a chance of forgetting one when cleaning up.

• You can choose to use meaningful names for temp files inside such a
directory, which may make development and debugging easier, and thus
improve script security and robustness.

• Use of a meaningful prefix in the path makes it clear what scripts are
running (this may be good or bad, but consider that ps or /proc do the
same thing). More importantly, it might highlight a script that has failed to
clean up after itself, which could possibly lead to an information leak.


Example 14-5 advises using a meaningful_prefix in the pathname you are 
creating. Some people will undoubtedly argue that since that is predictable, it 
reduces the security. It’s true that part of the path is predictable, but we still 
feel the advantages we’ve outlined outweigh this objection. If you still
disagree, simply omit the meaningful prefix. 


Depending on your risk and security needs, you may want to use random 
temporary files inside the random temporary directory, as we did in our 
example. That will probably not do anything to materially increase security, 
but if it makes you feel better, go for it. 


We talked about a race condition in touch $temp_file && chmod 0600
$temp_file. One way to avoid that is to do this:


saved_umask=$(umask)
umask 077
touch $temp_file
umask $saved_umask
unset saved_umask
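
Another option, sketched here as an assumption of ours rather than something the recipe prescribes, is to combine a restrictive umask with the shell's noclobber option inside a subshell. With noclobber set, the > redirection refuses to overwrite an existing file, so the creation fails if someone else got there first, and nothing needs to be restored afterward because the settings vanish with the subshell.

```bash
temp_file="${TMPDIR:-/tmp}/meaningful_prefix.${RANDOM}${RANDOM}${RANDOM}"

# In a subshell: restrictive umask plus noclobber, so the redirection
# creates the file with mode 0600 and fails if the name already exists
if ( umask 077; set -o noclobber; : > "$temp_file" ) 2>/dev/null; then
    echo "created $temp_file"
else
    echo "FATAL: could not create $temp_file" >&2
fi
```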


We recommend using both a random temporary directory and a random (or
semi-random) filename since it provides more overall benefits. 


If the numeric-only nature of $RANDOM really bothers you, consider
combining some other sources of pseudounpredictable and pseudorandom 
data and a hash function: 


nice_long_random_string=$( (last ; who ; netstat -a ; free ; date \
  ; echo $RANDOM) | md5sum | cut -d' ' -f1 )


WARNING 


We do not recommend using the fallback method shown here because the 
additional complexity is probably a cure that is worse than the disease. But it’s 
an interesting look at a way to make things a lot harder than they need to be. 


A theoretically more secure approach is to use the mktemp utility present on 
many modern systems, with a fallback to /dev/urandom, also present on many 
modern systems, or even $RANDOM. The problem is that mktemp and
/dev/urandom are not always available, and dealing with that in practice in a 
portable way is much more complicated than our solution. Example 14-7 is 
one way it could look, but try to use something simpler if possible. 


Example 14-7. ch14/MakeTemp


# cookbook filename: MakeTemp
# Function to incorporate or source into another script

#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Try to create a secure temp file name or directory
# Called like: temp_file=$(MakeTemp <file|dir> [path/to/name-prefix])
# Returns the name of secure temp file name or directory in $TEMP_NAME
# For example:
#   temp_dir=$(MakeTemp dir /tmp/$PROGRAM.foo)
#   temp_file=$(MakeTemp file /tmp/$PROGRAM.foo)
#
function MakeTemp {

    # Make sure $TMP is set to something
    [ -n "$TMP" ] || TMP='/tmp'

    local type_name=$1
    local prefix=${2:-$TMP/temp} # Unless prefix is defined, use $TMP + temp
    local temp_type=''
    local sanity_check=''

    case $type_name in
        file )
            temp_type=''
            ur_cmd='touch'
            #                  -f Regular file     -r Readable
            #                  -w Writable         -O Owned by me
            sanity_check='test -f $TEMP_NAME -a -r $TEMP_NAME \
                            -a -w $TEMP_NAME -a -O $TEMP_NAME'
            ;;
        dir|directory )
            temp_type='-d'
            ur_cmd='mkdir -p -m0700'
            #                  -d Directory        -r Readable
            #                  -w Writable         -x Searchable   -O Owned by me
            sanity_check='test -d $TEMP_NAME -a -r $TEMP_NAME \
                            -a -w $TEMP_NAME -a -x $TEMP_NAME -a -O $TEMP_NAME'
            ;;
        * ) Error "\nBad type in $PROGRAM:MakeTemp! Needs file|dir." 1 ;;
    esac

    # First try mktemp
    TEMP_NAME=$(mktemp $temp_type ${prefix}.XXXXXXXXX)

    # If that fails try urandom, if that fails give up
    if [ -z "$TEMP_NAME" ]; then
        TEMP_NAME="${prefix}.$(cat /dev/urandom | od -x | tr -d ' ' | head -1)"
        $ur_cmd $TEMP_NAME
    fi

    # Make sure the file or directory was actually created, or DIE
    if ! eval $sanity_check; then
        Error \
          "\aFATAL ERROR: can't make temp $type_name with '$0:MakeTemp$*'!\n" 2
    else
        echo "$TEMP_NAME"
    fi

} # end of function MakeTemp
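
On systems where you know mktemp is present, the "something simpler" can be very short indeed. The following is a minimal sketch under that assumption; the prefix and the work_file name are placeholders of ours, and there is deliberately no fallback.

```bash
# Ask mktemp for a private directory; the X's are replaced with random
# characters and the directory is created readable only by its owner
temp_dir=$(mktemp -d "${TMPDIR:-/tmp}/meaningful_prefix.XXXXXXXXXX") || {
    echo "FATAL: mktemp failed" >&2
    exit 100
}
trap 'rm -rf -- "$temp_dir"' EXIT

# Files created inside are protected by the directory's permissions
temp_file="$temp_dir/work_file"
: > "$temp_file"
```

Because mktemp both chooses the name and creates the directory in one step, there is no gap between checking for the name and using it.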


See Also 


• man mktemp
• Recipe 14.13, “Setting Permissions”
• Recipe 15.3, “Developing Portable Shell Scripts”
• http://en.wikipedia.org/wiki/Debian_Almquist_shell
• Appendix B, particularly ./scripts.noah/mktmp.bash


14.12 Validating Input 


Problem 


You’ve asked for input (e.g., from a user or a program), and to ensure 
security or data integrity you need to make sure you got what you asked for. 


Solution 


There are various ways to validate your input, depending on what the input is 
and how strict you need to be. 


Use pattern matching for simple “it matches or it doesn’t” situations (see 
Recipes 6.6, 6.7, and 6.8): 


[[ "$raw_input" == *.jpg ]] && echo "Got a JPEG file."


Use a case statement, as in Example 14-8, when there are various things that
might be valid (see Recipes 6.14 and 6.15).


Example 14-8. ch14/validate_using_case 


# cookbook filename: validate_using_case

case $raw_input in
    *.company.com        ) # Probably a local hostname
        ;;
    *.jpg                ) # Probably a JPEG file
        ;;
    *.[jJ][pP][gG]       ) # Probably a JPEG file, case-insensitive
        ;;
    foo | bar            ) # Entered 'foo' or 'bar'
        ;;
    [0-9][0-9][0-9]      ) # A 3-digit number
        ;;
    [a-z][a-z][a-z][a-z] ) # A 4-lowercase-char word
        ;;
    *                    ) # None of the above
        ;;
esac


Use a regular expression when pattern matching isn’t specific enough and
you have bash version 3.0+ (see Recipe 6.8). This example is looking for a
filename of three to six alphabetic characters with a .jpg extension
(case-sensitive):


[[ "$raw_input" =~ ^[[:alpha:]]{3,6}\.jpg$ ]] && echo "Got a JPEG file."
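
Regular expressions also let you capture the part you matched and then test it further. As an illustration only, the hypothetical function below checks that a value is a plausible TCP port: the pattern is anchored so the whole input must match, and the captured digits are then range-checked.

```bash
# valid_port VALUE: succeed if VALUE is a whole number from 1 to 65535
# (hypothetical helper for illustration)
valid_port() {
    local re='^([0-9]{1,5})$'   # anchored: the entire input must match
    [[ $1 =~ $re ]] || return 1
    # 10# forces base 10 so that input like 0080 is not read as octal
    (( 10#${BASH_REMATCH[1]} >= 1 && 10#${BASH_REMATCH[1]} <= 65535 ))
}

for candidate in 8080 0 70000 '22; rm -rf /'; do
    if valid_port "$candidate"; then
        echo "valid:   $candidate"
    else
        echo "invalid: $candidate"
    fi
done
```

Keeping the pattern in a variable, as here, sidesteps the quoting differences in how various bash versions treat a regular expression written directly after =~.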


Discussion 


For a larger and more detailed example, see the examples/scripts/shprompt in 
a recent bash tarball. Note this was written by Chet Ramey, who maintains 
bash: 


# shprompt -- give a prompt and get an answer satisfying certain criteria
#
# shprompt [-dDfFsy] prompt
#    s = prompt for string
#    f = prompt for filename
#    F = prompt for full pathname to a file or directory
#    d = prompt for a directory name
#    D = prompt for a full pathname to a directory
#    y = prompt for y or n answer
#
# Chet Ramey
# chet@ins.CWRU.Edu


For a similar example, see examples/scripts.noah/y_or_n_p.bash, written 
circa 1993 by Noah Friedman and later converted to bash version 2 syntax by 
Chet Ramey. Also in the examples, see ./functions/isnum2 and
./functions/isvalidip.


See Also 


• Recipe 3.5, “Getting User Input”
• Recipe 3.6, “Getting Yes or No Input”
• Recipe 3.7, “Selecting from a List of Options”
• Recipe 3.8, “Prompting for a Password”
• Recipe 6.6, “Testing for Equality”
• Recipe 6.7, “Testing with Pattern Matches”
• Recipe 6.8, “Testing with Regular Expressions”
• Recipe 6.14, “Branching Many Ways”
• Recipe 6.15, “Parsing Command-Line Arguments”
• Recipe 11.2, “Supplying a Default Date”
• Recipe 13.6, “Parsing Text with a read Statement”
• Recipe 13.7, “Parsing with read into an Array”
• Appendix B for bash examples


14.13 Setting Permissions


Problem 


You want to set permissions in a secure manner. 


Solution 

If you need to set exact permissions for security reasons (or you are sure that
you don’t care what is already there, and you just need to change it), use
chmod with four-digit octal modes:


chmod 0755 some_script 


If you only want to add or remove permissions, but need to leave other 
existing permissions unchanged, use the + and - operations in symbolic 
mode: 


chmod +x some_script 


If you try to recursively set permissions on all the files in a directory structure 
using something like chmod -R 0644 some_directory then you’ll regret it 
because you’ve now rendered any subdirectories nonexecutable, which 
means you won’t be able to access their content, cd into them, or traverse 
below them. Use find and xargs with chmod to set the files and directories 
individually. For file permissions: 


find some_directory -type f -print0 | xargs -0 chmod 0644


For directory permissions:


find some_directory -type d -print0 | xargs -0 chmod 0755


Of course, if you only want to set permissions on the files in a single
directory (non-recursive), just cd in there and set them.


When creating a directory, use mkdir -m mode new_directory since you
not only accomplish two tasks with one command, but you avoid any
possible race condition between creating the directory and setting the
permissions.
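To see the file and directory commands working together, here is a small sketch that builds a throwaway tree and then checks the result. The names demo_root, docs, a.txt, and b.txt are hypothetical, and the sketch assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do).

```bash
#!/usr/bin/env bash
# Sketch: create a scratch tree, then set file and directory modes separately.
demo_root=$(mktemp -d) || exit 1

# mkdir -m sets the mode at creation time, so no separate chmod is needed
mkdir -m 0755 "$demo_root/docs"
touch "$demo_root/docs/a.txt" "$demo_root/docs/b.txt"

# Files get 0644, directories get 0755; -print0/-0 cope with odd filenames
find "$demo_root" -type f -print0 | xargs -0 chmod 0644
find "$demo_root" -type d -print0 | xargs -0 chmod 0755

# List anything that did not end up with the intended mode (expect no output)
find "$demo_root" -type f ! -perm 0644
find "$demo_root" -type d ! -perm 0755
```

The two final find commands double as a cheap verification step: if either prints a path, that path does not have the mode you intended.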


Discussion 


Many people are in the habit of using three-digit octal modes, but we like to
use all four possible digits to be explicit about what we mean to do with all
attributes. We also prefer using octal mode when possible because it’s very
clear what permissions you are going to end up with. You may also use the
absolute operation (=) in symbolic mode if you like, but we’re traditionalists
who like the old octal method best.


Ensuring the final permissions when using the symbolic mode and the + or - 
operations is trickier since they are relative, not absolute. Unfortunately, there 
are many cases where you can’t simply arbitrarily replace the existing 
permissions using octal mode. In such cases you have no choice but to use 
symbolic mode, often using + to add a permission while not disturbing other 
existing permissions. Consult your specific system’s chmod for details, and 
verify that your results are as you expect. Here are a few examples: 


$ ls -l
-rw-r--r--  1 jp  users  0 Dec  1 02:09 script.sh

# Make file readable, writable, and executable for the owner using
# octal notation
$ chmod 0700 script.sh

$ ls -l
-rwx------  1 jp  users  0 Dec  1 02:09 script.sh

# Make file readable and executable for everyone using symbolic
# notation
$ chmod ugo+rx *.sh

$ ls -l
-rwxr-xr-x  1 jp  users  0 Dec  1 02:09 script.sh


Note in the last example that although we added (+) rx to everyone (ugo), the 
owner still has write (w) permission. That’s what we wanted to do here, and 
that is often the case. But do you see how, in a security setting, it might be 
easy to make a mistake and allow an undesirable permission to slip through 
the cracks? That’s why we like to use the absolute octal mode if possible, and 
of course we always check the results of our command. 


In any case, before you adjust the permissions on a large group of files, 
thoroughly test your command. You may also want to back up the 
permissions and owners of the files. See Recipe 17.8 for details. 


See Also 


• man chmod
• man find
• man xargs
• Recipe 9.2, “Handling Filenames Containing Odd Characters”
• Recipe 17.8, “Capturing File Metadata for Recovery”


14.14 Leaking Passwords into the Process List 


Problem 


ps may show passwords entered on the command line in the clear. For 
example: 


$ ./cheesy_app -u user -p password &
[1] 13301

$ ps
  PID TT  STAT     TIME COMMAND
 5280 p0  S     0:00.08 -bash
 9784 p0  R+    0:00.00 ps
13301 p0  S     0:00.01 /bin/sh ./cheesy_app -u user -p password


Solution 


Try really hard not to use passwords on the command line. 


Discussion 
Really. Don’t do that. 


Many applications that provide a -p or similar switch will also prompt you if 
a password is required and you do not provide it on the command line. That’s 
great for interactive use, but not so great in scripts. You may be tempted to 
write a trivial “wrapper” script or an alias to try and encapsulate the password 
on the command line. Unfortunately, that won’t work since the command is 
eventually run and so ends up in the process list anyway. If the command can 
accept the password on STDIN, you may be able to pass it in that way: 


./bad_app < ~/.hidden/bad_apps_password 


That creates other problems, but at least avoids displaying the password in 
the process list. 
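To make the STDIN approach concrete, here is a minimal sketch. The function read_password_from_stdin is a hypothetical stand-in for an application that reads its password from standard input, and the temporary password file is likewise only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for an application that reads its password from
# STDIN instead of taking it as a command-line argument.
read_password_from_stdin() {
    local password
    # -r keeps backslashes literal; IFS= preserves leading/trailing spaces.
    # The second test accepts a final line that lacks a trailing newline.
    IFS= read -r password || [ -n "$password" ] || return 1
    # Report only the length so the secret itself is never displayed
    printf 'Got a password of %d characters\n' "${#password}"
}

# mktemp creates the file readable and writable only by its owner
password_file=$(mktemp) || exit 1
printf '%s\n' 'sekrit' > "$password_file"

# The password travels over STDIN, so it never appears in an argument list
read_password_from_stdin < "$password_file"
rm -f "$password_file"
```

Because the redirection is handled by the shell, ps would show only the command name here, not the contents of the password file.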


If that won’t work, you’ll need to either find a new app, patch the one you are 
using, or just live with it. 


See Also 


• Recipe 3.8, “Prompting for a Password”
• Recipe 14.20, “Using Passwords in Scripts”


14.15 Writing setuid or setgid Scripts


Problem


You have a problem you think you can solve by using the setuid or setgid bit
on a shell script.


Solution 


Use Unix groups and file permissions and/or sudo to grant the appropriate 
users the least privileges they need to accomplish their tasks. 

Using the setuid or setgid bit on a shell script will create more problems,
especially security problems, than it solves. Some systems (such as Linux)
don’t even honor the setuid bit on shell scripts, so creating setuid shell scripts
creates an unnecessary portability problem in addition to the security risks.


Discussion 


setuid root scripts are especially dangerous, so don’t even think about it. Use 
sudo. 


setuid and setgid have a different meaning when applied to directories than 
they do when applied to executable files. When one of these is set on a 
directory it causes any newly created files or subdirectories to be owned by 
the directory’s owner or group, respectively. 


Note you can check a file to see if it is setuid by using test -u and check to 
see if it is setgid by using test -g: 


$ mkdir suid_dir sgid_dir

$ touch suid_file sgid_file

$ ls -l
total 4
drwxr-xr-x  2 jp  users  512 Dec  9 03:45 sgid_dir
-rw-r--r--  1 jp  users    0 Dec  9 03:45 sgid_file
drwxr-xr-x  2 jp  users  512 Dec  9 03:45 suid_dir
-rw-r--r--  1 jp  users    0 Dec  9 03:45 suid_file

$ chmod 4755 suid_dir suid_file

$ chmod 2755 sgid_dir sgid_file

$ ls -l
total 4
drwxr-sr-x  2 jp  users  512 Dec  9 03:45 sgid_dir
-rwxr-sr-x  1 jp  users    0 Dec  9 03:45 sgid_file
drwsr-xr-x  2 jp  users  512 Dec  9 03:45 suid_dir
-rwsr-xr-x  1 jp  users    0 Dec  9 03:45 suid_file

$ [ -u suid_dir ] && echo 'Yup, suid' || echo 'Nope, not suid'
Yup, suid

$ [ -u sgid_dir ] && echo 'Yup, suid' || echo 'Nope, not suid'
Nope, not suid

$ [ -g sgid_file ] && echo 'Yup, sgid' || echo 'Nope, not sgid'
Yup, sgid

$ [ -g suid_file ] && echo 'Yup, sgid' || echo 'Nope, not sgid'
Nope, not sgid


See Also 


• man chmod
• Recipe 14.18, “Running as a Non-root User”
• Recipe 14.19, “Using sudo More Securely”
• Recipe 14.20, “Using Passwords in Scripts”
• Recipe 17.15, “Using sudo on a Group of Commands”


14.16 Restricting Guest Users 


The material concerning the restricted shell in this recipe also appears in 
Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly). 


Problem 


You need to allow some guest users on your system and need to restrict what
they can do.


Solution 


Avoid using shared accounts if possible, since you lose accountability and
create logistical headaches when users leave and you need to change the
password and inform the other users. Create separate accounts with the least
possible permissions necessary to do whatever is needed. Consider using:

• A chroot jail, as discussed in Recipe 14.17
• SSH to allow noninteractive access to commands or resources, as
discussed in Recipe 14.21
• bash’s restricted shell


Discussion 


The restricted shell is designed to put the user into an environment where 
their ability to move around and write files is severely limited. It’s usually 
used for guest accounts. You can make a user’s login shell restricted by 
putting rbash in the user’s /etc/passwd entry if this option was included when 
bash was compiled. 


The specific constraints imposed by the restricted shell disallow the user from 
doing the following: 


• Changing working directories. cd is inoperative. If you try to use it, you
will get the error message cd: restricted from bash.
• Redirecting output to a file. The redirectors >, >|, <>, and >> are not
allowed.
• Assigning a new value to the environment variables $ENV, $BASH_ENV,
$SHELL, or $PATH.
• Specifying any commands with slashes (/) in them. The shell will treat
files outside of the current directory as “not found.”
• Using the exec builtin.
• Specifying a filename containing a / as an argument to the . (source)
builtin command.
• Importing function definitions from the shell environment at startup.
• Adding or deleting builtin commands with the -f and -d options to the
enable builtin command.
• Specifying the -p option to the command builtin command.
• Turning off restricted mode with set +r.


These restrictions go into effect after the user’s .bash_profile and 
environment files are run. In addition, it is wise to change the owner of the 
user’s .bash_profile and .bashrc files to root, and make these files read-only. 
The user’s home directory should also be made read-only. 


This means that the restricted shell user’s entire environment is set up in 
/etc/profile and .bash_profile. Since the user can’t access /etc/profile and 
can’t overwrite .bash_profile, this lets the system administrator configure the
environment as they see fit. It’s also a good idea that the last command in the 
startup file be a cd to some other directory, usually a subdirectory of the 
user’s $HOME for an extra layer of protection. 


Two common ways of setting up such environments are to set up a directory 
of safe commands and have that directory be the only one in $PATH, and to 
set up a command menu from which the user can’t escape without exiting the 
shell. 
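A quick way to get a feel for the restrictions described in this recipe is to run a few one-line commands under bash -r, which starts a restricted shell. This is only a sketch: try_restricted is a hypothetical helper name, and it assumes your bash was built with restricted shell support. The exact wording of the error messages varies between bash versions, so none is shown here.

```bash
#!/usr/bin/env bash
# Run a command string in a restricted bash and report what happened.
try_restricted() {
    bash -r -c "$1" 2>&1
    echo "exit status: $?"
}

try_restricted 'cd /'                # changing directories is refused
try_restricted 'echo hi > out.txt'   # redirecting output is refused
try_restricted '/bin/ls'             # command names containing / are refused
try_restricted 'PATH=/tmp'           # assigning to PATH is refused
```

Each call should print an error from bash followed by a nonzero exit status, and no out.txt file should be created.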


WARNING


The restricted shell is not proof against a determined attacker. It can also be
difficult to lock down as well as you think you have, since many common
applications, such as vi and Emacs, allow shell escapes that might bypass the
restricted shell entirely.


Used wisely it can be a valuable additional layer of security, but it should not be
the only layer.


Note that the original Bourne shell has a restricted version called rsh, which
may be confused with the so-called r-tools (rsh, rcp, rlogin, etc.) remote shell
program, which is also rsh. The very insecure rsh has been mostly replaced
(we most sincerely hope) by ssh (the Secure Shell).


See Also 
• Recipe 14.17, “Using chroot Jails”
• Recipe 14.21, “Using SSH Without a Password”


14.17 Using chroot Jails 


Problem 


You have to use a script or application that you don’t trust. 


Solution 


Consider placing it in a so-called chroot jail. The chroot command changes 
the root directory of the current process to the directory you specify, then 
returns a shell or execs a given command. That has the effect of placing the 
process, and thus the program, into a jail from which it theoretically can’t 
escape to the parent directory. So if that application is compromised or 
otherwise does something malicious, it can only affect the small portion of 
the filesystem you restricted it to. In conjunction with running as a user with 
very limited rights, this is a very useful layer of security to add. 


Unfortunately, covering all the details of chroot is beyond the scope of this 
recipe, since it would probably require a whole separate book. We present it 
here to promote awareness of the functionality. 


Discussion 


So why doesn’t everything run in chroot jails? Because many applications
need to interact with other applications, files, directories, or sockets all over
the filesystem. That’s the tricky part about using chroot jails; the application
can’t see outside of its walls, so everything it needs must be inside those
walls. The more complicated the application, the more difficult it is to run in
a jail.

Some applications that must inherently be exposed to the internet, such as
DNS (e.g., BIND), web, and mail (e.g., Postfix) servers, may be configured
to run in chroot jails with varying degrees of difficulty. See the
documentation for the distribution and specific applications you are running
for details.


Another interesting use of chroot is during system recovery. Once you have 
booted from a LiveCD and mounted the root filesystem on your hard drive, 
you may need to run a tool such as LILO or GRUB which, depending on 
your configuration, might need to believe it’s really running on the damaged 
system. If the LiveCD and the installed system are not too different, you can 
usually chroot into the mount point of the damaged system and fix it. That 
works because all the tools, libraries, configuration files, and devices already
exist in the jail, since they really are a complete (if not quite working) system.
You might have to experiment with your $PATH in order to find things you
need once you’ve chrooted though (that’s an aspect of the “if the LiveCD and
the installed system are not too different” caveat).


On a related note, the NSA’s Security Enhanced Linux (SELinux)
implementation of Mandatory Access Control (MAC) may be of interest.
MAC provides a very granular way to specify at a system level what is and is 
not allowed, and how various components of the system may interact. The 
granular definition is called a security policy and it has a similar effect to a 
jail, in that a given application or process can do only what the policy allows 
it to do. 


Red Hat Linux has incorporated SELinux into its enterprise product. Novell’s 
SUSE product has a similar MAC implementation called AppArmor, and 
there are similar implementations for Solaris, BSD, and macOS. 


See Also


• man chroot
• https://selinuxproject.org
• http://en.wikipedia.org/wiki/Mandatory_access_control
• http://olivier.sessink.nl/jailkit/
• http://www.jmcresearch.com/projects/jail/


14.18 Running as a Non-root User 


Problem 


You'd like to run your scripts as a non-root user, but are afraid you won’t be 
able to do the things you need to do. 


Solution 


Run your scripts under non-root user IDs, either as you or as dedicated users,
and run interactively as non-root, but configure sudo to handle any tasks that 
require elevated privileges. 


Discussion 


sudo may be used in a script as easily as it may be used interactively. See the 
sudoers NOPASSWD option especially (see Recipe 14.19 and the sudoers 
manpage). 


See Also 


• man sudo
• man sudoers
• Recipe 14.15, “Writing setuid or setgid Scripts”
• Recipe 14.19, “Using sudo More Securely”
• Recipe 14.20, “Using Passwords in Scripts”
• Recipe 17.15, “Using sudo on a Group of Commands”


14.19 Using sudo More Securely 


Problem 


You want to use sudo but are worried about granting too many people too 
many privileges. 


Solution 


Good! You should be worrying about security. While using sudo is much 
more secure than not using it, the default settings may be greatly improved. 


Take the time to learn a bit about sudo itself and the /etc/sudoers file. In 
particular, learn why in most cases you should not be using the ALL=(ALL) 
ALL specification! Yes, that will work, but it’s not even remotely secure. The 
only difference between that and just giving everyone the root password is 
that they don’t know the root password; they can still do everything root can 
do. sudo logs the commands it runs, but that’s trivial to avoid by using sudo 
bash. 


Second, give your needs some serious thought. Just as you shouldn’t be using 
the ALL=(ALL) ALL specification, you probably shouldn’t be managing users 
one by one either. The sudoers utility allows for very granular management, 
and we strongly recommend using it. man sudoers provides a wealth of 
material and examples, especially the section on preventing shell escapes. 


Third, the sudoers file has a NOPASSWD tag that, as you might expect, allows 
user accounts to perform privileged operations without first having to enter 
their user passwords. This is one way to allow automation requiring root 
access without leaving plain-text passwords all over the place, but it’s also 
obviously a double-edged sword. 


sudoers allows for four kinds of aliases: user, runas, host, and command.
Judicious use of them as roles or groups will significantly reduce the
maintenance burden. For instance, you can set up a User_Alias for
BUILD_USERS, then define the machines those users need to run on with
Host_Alias and the commands they need to run with Cmnd_Alias. If you set
a policy to only edit /etc/sudoers on one machine and copy it around to all
relevant machines periodically using scp with public-key authentication, you
can set up a very secure yet usable system of least privilege.
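As a rough illustration of how the aliases fit together, the sketch below writes a candidate sudoers fragment to a scratch file so it can be reviewed before anything touches /etc/sudoers. Every user, host, and command name in it (alice, bob, build1, build2, and the two command paths) is hypothetical; substitute your own.

```bash
#!/usr/bin/env bash
# Sketch: write a candidate sudoers fragment to a scratch file for review.
fragment=$(mktemp) || exit 1

cat > "$fragment" <<'EOF'
User_Alias  BUILD_USERS = alice, bob
Host_Alias  BUILD_HOSTS = build1, build2
Cmnd_Alias  BUILD_CMDS  = /usr/bin/make, /usr/bin/install

# Members of BUILD_USERS may run BUILD_CMDS as root on BUILD_HOSTS
BUILD_USERS BUILD_HOSTS = (root) BUILD_CMDS
EOF

cat "$fragment"
```

If visudo is installed, running visudo -c -f "$fragment" should check the syntax of the scratch file without modifying the production file.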


TIP


When sudo asks for your password, it’s really asking for your password. As in,
your user account. Not root. For some reason people often get confused by this
at first.


Discussion 


Unfortunately, sudo is not installed by default on every system. It is usually
installed on Linux, macOS, and OpenBSD; other systems will vary. You
should consult your system’s documentation and install it if it’s not already
there.


WARNING


You should always use visudo to edit your /etc/sudoers file. Like vipw, visudo
locks the file so that only one person can edit it at a time, and it performs some
syntax sanity checks before replacing the production file so that you don’t
accidentally lock yourself out of your system.


See Also 


• man sudo
• man sudoers
• man visudo
• SSH, The Secure Shell: The Definitive Guide, 2nd Edition, by Daniel J.
Barrett, Richard Silverman, and Robert G. Byrnes (O’Reilly)
• Recipe 14.15, “Writing setuid or setgid Scripts”
• Recipe 14.18, “Running as a Non-root User”
• Recipe 14.20, “Using Passwords in Scripts”
• Recipe 17.15, “Using sudo on a Group of Commands”


14.20 Using Passwords in Scripts


Problem 


You need to hardcode a password in a script. 


Solution 


This is obviously a bad idea and should be avoided whenever possible. 
Unfortunately, sometimes it isn’t possible to avoid it. 

The first way to try to avoid doing this is to see if you can use sudo with the 
NOPASSWD option to avoid having to hardcode a password anywhere. This 
obviously has its own risks, but is worth checking out. See Recipe 14.19 for 
more details. 

Another alternative may be to use SSH with public keys and ideally restricted 
commands (see Recipe 14.21). 

If there is no other way around it, about the best you can do is put the user ID 
and password in a separate file that is readable only by the user who needs it, 
then source that file when necessary (Recipe 10.3). Leave that file out of 
revision control, of course. 
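The separate-file approach might look something like the following sketch. The file location and the variable names DB_USER and DB_PASS are hypothetical, and in real use you would create the credentials file once by hand rather than from the script that consumes it.

```bash
#!/usr/bin/env bash
# Sketch: keep credentials in a separate, owner-only file and source it.
cred_dir=$(mktemp -d) || exit 1
cred_file="$cred_dir/db_credentials"

# Create the file under a restrictive umask so it is never world-readable
( umask 077
  printf 'DB_USER=%s\nDB_PASS=%s\n' 'appuser' 'example-only' > "$cred_file" )

# Refuse to continue unless the file exists and is readable by us
[ -r "$cred_file" ] || { echo "Cannot read $cred_file" >&2; exit 1; }

# Source the file to load the variables into the current shell
. "$cred_file"
echo "Connecting as $DB_USER"
```

Remember that sourcing executes the file as shell code, which is one more reason it must be writable only by a user you trust.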


Discussion 


Accessing data on remote machines in a secure manner is relatively easy
using SSH (see Recipe 14.21 and Recipe 15.11). It may even be possible to
use that SSH method to access other data on the same host, but it’s probably
much more efficient to use sudo for that. But what about accessing data in a
remote database, perhaps using some SQL command? There is not much you
can do in that case.


Yes, you say, but what about crypt or the other password hashes? The 
problem is that the secure methods for storing passwords all involve using 
what’s known as a one-way hash. The password checks in, but it can’t check 
out. In other words, given the hash, there is theoretically no way to get the 
plain-text password back out. And that plain-text password is the point—we 
need it to access our database or whatever. So secure storage is out. 


That leaves insecure storage, but the problem here is that it may actually be 
worse than plain text because it might give you a false sense of security. If it 
really makes you feel better, and you promise not to get a false sense of 
security, go ahead and use ROT13 or something to obfuscate the password:


ROT13=$(echo password | tr 'A-Za-z' 'N-ZA-Mn-za-m') 


ROT13 only handles ASCII letters, so you could also use ROT47 to handle 
some punctuation as well: 


ROT47=$(echo password | tr '!-~' 'P-~!-O')


WARNING 
We really can’t stress enough that ROT13 and ROT47 are nothing more than 
“security by obscurity” and thus are not security at all. They are better than 
nothing, if and only if you (or your management) do not get a false sense that 
you are “secure” when you are not. Just be aware of your risks. Having said 
that, the reality is sometimes the benefit outweighs the risk. 
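Both rotations are their own inverse, so the same tr command that obscures a string also recovers it. The sketch below wraps each one in a small function; the names rot13 and rot47 are our own, and LC_ALL=C is set as a precaution so that tr treats the ranges as plain ASCII.

```bash
#!/usr/bin/env bash
# ROT13 and ROT47 undo themselves: applying one twice restores the input.
rot13() { LC_ALL=C tr 'A-Za-z' 'N-ZA-Mn-za-m'; }
rot47() { LC_ALL=C tr '!-~' 'P-~!-O'; }

obscured=$(echo 'password' | rot13)
echo "$obscured"                  # cnffjbeq
echo "$obscured" | rot13          # password
echo 'password' | rot47 | rot47   # password
```

That anyone can reverse the transformation with the same one-liner is exactly why this is obfuscation and not encryption.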


See Also 
• http://en.wikipedia.org/wiki/ROT13
• Recipe 10.3, “Using Configuration Files in a Script”
• Recipe 14.15, “Writing setuid or setgid Scripts”
• Recipe 14.18, “Running as a Non-root User”
• Recipe 14.19, “Using sudo More Securely”
• Recipe 14.21, “Using SSH Without a Password”
• Recipe 15.11, “Getting Input from Another Machine”
• Recipe 17.15, “Using sudo on a Group of Commands”


14.21 Using SSH Without a Password 


Problem 


You need to use SSH or scp in a script and would like to do so without using 
a password. Or you’re using them in a cron job and can’t have a password.


WARNING 


SSH1 (the protocol) and ssh1 (the executable) are deprecated and considered
less secure than the newer SSH2 protocol as implemented by OpenSSH and 
SSH Communications Security. We strongly recommend using SSH2 with 
OpenSSH and will not cover SSH1 here. 


Solution 


There are two ways to use SSH without a password: the wrong way and the
right way. The wrong way is to use a key pair whose private key is not
encrypted by a passphrase. The right way is to use a passphrase-protected
private key with ssh-agent or keychain.


We assume you are using OpenSSH; if not, consult your documentation (the 
commands and files will be similar). 


First, you need to create a key pair if you don’t already have one. Only one
key pair is necessary to authenticate you to as many machines as you
configure, but you may decide to use more than one key pair, perhaps for
personal and work reasons. The pair consists of a private key that you should
protect at all costs, and a public key (*.pub) that you can post on a billboard if
you like. The two are related in a complex mathematical way such that they
can identify each other, but you can’t derive one from the other.


Use ssh-keygen (this might be ssh-keygen2 if you’re not using OpenSSH) to
create a key pair. -t is used to specify the type; consult your system’s man
page for possible values. -b is optional and specifies the number of bits in the
new key (2,048 is the default for RSA keys at the time of this writing). -C
allows you to specify a comment, but it defaults to user@hostname if you
omit it. We recommend using -t rsa -b 4096 -C 'meaningful comment'
and we recommend strongly against using no passphrase. ssh-keygen also
allows you to change your key file’s passphrase or comment:


$ ssh-keygen --help
unknown option -- -
usage: ssh-keygen [options]
Options:
  -A          Generate non-existent host keys for all key types.
  -a number   Number of KDF rounds for new key format or moduli primality tests.
  -B          Show bubblebabble digest of key file.
  -b bits     Number of bits in the key to create.
  -C comment  Provide new comment.
  -c          Change comment in private and public key files.
  -D pkcs11   Download public key from pkcs11 token.
  -e          Export OpenSSH to foreign format key file.
  -F hostname Find hostname in known hosts file.
  -f filename Filename of the key file.
  -G file     Generate candidates for DH-GEX moduli.
  -g          Use generic DNS resource record format.
  -H          Hash names in known_hosts file.
  -h          Generate host certificate instead of a user certificate.
  -I key_id   Key identifier to include in certificate.
  -i          Import foreign format to OpenSSH key file.
  -J number   Screen this number of moduli lines.
  -j number   Start screening moduli at specified line.
  -K checkpt  Write checkpoints to this file.
  -k          Generate a KRL file.
  -L          Print the contents of a certificate.
  -l          Show fingerprint of key file.
  -M memory   Amount of memory (MB) to use for generating DH-GEX moduli.
  -m key_fmt  Conversion format for -e/-i (PEM|PKCS8|RFC4716).
  -N phrase   Provide new passphrase.
  -n name,... User/host principal names to include in certificate
  -O option   Specify a certificate option.
  -o          Enforce new private key format.
  -P phrase   Provide old passphrase.
  -p          Change passphrase of private key file.
  -Q          Test whether key(s) are revoked in KRL.
  -q          Quiet.
  -R hostname Remove host from known_hosts file.
  -r hostname Print DNS resource record.
  -S start    Start point (hex) for generating DH-GEX moduli.
  -s ca_key   Certify keys with CA key.
  -T file     Screen candidates for DH-GEX moduli.
  -t type     Specify type of key to create.
  -u          Update KRL rather than creating a new one.
  -V from:to  Specify certificate validity interval.
  -v          Verbose.
  -W gen      Generator to use for generating DH-GEX moduli.
  -y          Read private key file and print public key.
  -Z cipher   Specify a cipher for new private key format.
  -z serial   Specify a serial number.


$ ssh-keygen -v -t rsa -b 4096 -C 'This is my new key'
Generating public/private rsa key pair.
Enter file in which to save the key (/home/jp/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/jp/.ssh/id_rsa.
Your public key has been saved in /home/jp/.ssh/id_rsa.pub.
The key fingerprint is:
eb:b3:0b:3a:d8:9f:d0:02:5d:99:ce:69:98:ef:f0:0c This is my new key
The key's randomart image is: 

+--[ RSA 4096]----+ 

| | 


| | 
| + | 
| |


$ ls -l ~/.ssh/id_rsa*
-rw-------  1 jp  jp  3.3K Aug 27 15:10 /home/jp/.ssh/id_rsa
-rw-r--r--  1 jp  jp   744 Aug 27 15:10 /home/jp/.ssh/id_rsa.pub


$ fold -w75 ~/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCrxvIjPrLxx9VgkEQuBfdiGGZ5KC380yTB477
MFyw4W7JMDnN5p7Yx8dvl91Fuc1i3U+RsuBBqwjNvB6hHesdwr/6D2EgoTGIDbegNNLa+qb8jJtX
ZK1s+B9sk9SoILT4AF5wEAMag0K4JmvOv/xFHwWVRM1BfUEQQIVP7Z8v56e7HWz/pZMbOtM89WMg
ITyJh6cuTG1XHRmYxpOoaPBEKeDXTMOmMfyAQwO2yQt6fl29RW1DH5J+jVYarsWScGe6SKSYGQPZ
L7a3KRkbpGPRdVK2CY2P1tXQLnh9hPYqvHtAzXUMYJpSwBkNzRN3A571FBtNUxLGtP+xHNEN7Kz
WpUsT1wv6DQw//UDSHIZSHVUHMKp414y6dwmKgXTtqVWXYbB/t2EU+CuWk80kLA2Tv7dKUnn8tA
87D1LU3hAhr58jDEzXbIfL9yYhV2xHBxVUDfF80Lv9p9ZKngRx8hkj8MoDr0I6EqL3IhWKRqRdJy
GwkAyjCk5UQ9EH/SQ3NjhJE1Qb3100dgE3ZKXfmM8VXBZSOXTH4OHjd9RA4VCQWJEpdR2QUgeSXW
aM94v3p606njKT6FFXV36S33/F/ROc1ivZlcIDTpRCbpCXRNkgPtDAImMBNmmweaYBOYm3wqHRB2I
bnw5vftDpptndB774sV2FcRxptkM8Pd/vRS35q56FSgcT6Q== This is my new key


Once you have a key pair, add your public key to the ~/.ssh/authorized_keys
file in your home directory on any other machines to which you wish to
connect using this key pair. You can use scp, cp with a floppy or USB key, or
simple cut-and-paste from terminal sessions to do that. The important part is
that it all ends up on a single line. While you can do it all in one command
(e.g., scp id_rsa.pub remote_host:.ssh/authorized_keys), we don’t
recommend that even when you’re “absolutely sure” that authorized_keys
doesn’t exist, because it would overwrite any keys already in that file.
Instead, you can use a slightly more complicated but much safer command:


$ ssh remote_host "echo $(cat ~/.ssh/id_rsa.pub) >> ~/.ssh/authorized_keys"
jp@remote_host's password:

$ ssh remote_host
Last login: Thu Dec 14 00:02:52 2006 from openbsd.jpsdomain.org
NetBSD 2.0.2 (GENERIC) #0: Wed Mar 23 08:53:42 UTC 2005

Welcome to NetBSD!

$ exit
logout
Connection to remote_host closed.


As you can see, we were prompted for a password for the initial ssh command,
but after that ssh just worked. What isn’t shown here is the use of ssh-agent,
which cached the passphrase to the key so that we didn’t have to type it.


The command shown here also assumes that ~/.ssh exists on both machines.
If not, create it using mkdir -m 0700 -p ~/.ssh. Your ~/.ssh directory
must be mode 0700 or OpenSSH will complain. It’s not a bad idea to use
chmod 0600 ~/.ssh/authorized_keys as well.
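The directory and permission steps can be bundled into one small function that is also safe to run more than once. This is a sketch only: add_authorized_key is a hypothetical helper that you would run on the machine holding authorized_keys, and it takes the .ssh directory as an argument so it can be tried out against a scratch directory first.

```bash
#!/usr/bin/env bash
# Hypothetical helper: append a public key to an authorized_keys file only
# if it is not already there, keeping the permissions OpenSSH expects.
add_authorized_key() {
    local pubkey_file=$1 ssh_dir=$2
    local auth_file="$ssh_dir/authorized_keys"

    mkdir -m 0700 -p "$ssh_dir" || return 1
    touch "$auth_file" || return 1
    chmod 0600 "$auth_file" || return 1

    # -q quiet, -x whole-line match, -F fixed strings, -f patterns from a file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Try it against a scratch directory with a made-up key
scratch=$(mktemp -d) || exit 1
printf 'ssh-rsa AAAAexample demo key\n' > "$scratch/id_demo.pub"
add_authorized_key "$scratch/id_demo.pub" "$scratch/.ssh"
add_authorized_key "$scratch/id_demo.pub" "$scratch/.ssh"   # no duplicate added
cat "$scratch/.ssh/authorized_keys"
```

Appending with >> preserves existing keys, and the grep check keeps repeated runs from filling the file with duplicates.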


It’s also worth noting that we’ve just set up a one-way relationship. We can 
SSH from our local host to our remote host with no password, but the same is 
not true in reverse, due to both lack of the private key and lack of the agent 
on the remote host. You can simply copy your private key all over the place 
to enable a “web of passwordless SSH,” but that complicates matters when 
you want to change your passphrase and it makes it harder to secure your 
private key. If possible, you are better off having one well-protected and 
trusted machine from which you ssh out to remote hosts as needed. 


The SSH agent is clever and subtle in its use. We might argue it’s too clever.
The way it is intended to be used in practice is via an eval and command
substitution: eval `ssh-agent`. That creates two environment variables so
that ssh or scp can find the agent and ask it about your identities. That’s very
slick, and it’s well documented in many places. The only problem is that this
is unlike any other program in common use (except some of the features of
less; see Recipe 8.15) and is totally obscure to a new or uninformed user.


If you just run the agent, it prints out some details and looks like it worked. 
And it did, in that it’s now running. But it won’t actually do anything, 
because the necessary environment variables were never actually set. We 
should also mention in passing that the handy -k switch tells the agent to exit. 


Here are some examples of incorrect and correct ways to use the SSH agent: 


# The Wrong Way to Use the Agent 

# Nothing in the environment 

$ set | grep SSH 

$ ssh-agent 

SSH_AUTH_SOCK=/tmp/ssh-bACKp27592/agent.27592; export SSH_AUTH_SOCK; 
SSH_AGENT_PID=24809; export SSH_AGENT_PID; 

echo Agent pid 24809; 


# Still nothing 

$ set | grep SSH 

# Can't even kill it, because -k needs $SSH_AGENT_PID 
$ ssh-agent -k 

SSH_AGENT_PID not set, cannot kill agent 


# Is it even running? Yes 


$ ps x
  PID TT  STAT    TIME COMMAND
24809 ??  Is   0:00.01 ssh-agent
22903 p0  I    0:03.05 -bash (bash)
11303 p0  R+   0:00.00 ps -x


$ kill 24809


$ ps x
  PID TT  STAT    TIME COMMAND
22903 p0  I    0:03.06 -bash (bash)
30542 p0  R+   0:00.00 ps -x


# This is correct
$ eval `ssh-agent`
Agent pid 21642 


# Hey, it worked! 

$ set | grep SSH 

SSH_AGENT_PID=21642 
SSH_AUTH_SOCK=/tmp/ssh-ZfEsa28724/agent.28724 


# Kill it - the wrong way 

$ ssh-agent -k 

unset SSH_AUTH_SOCK; 

unset SSH_AGENT_PID; 

echo Agent pid 21642 killed; 


# Oops, the process is dead but it didn't clean up after itself 
$ set | grep SSH 

SSH_AGENT_PID=21642 
SSH_AUTH_SOCK=/tmp/ssh-ZfEsa28724/agent.28724 


# The Right Way to Use the Agent 
$ eval `ssh-agent` 
Agent pid 19330 


$ set | grep SSH 
SSH_AGENT_PID=19330 
SSH_AUTH_SOCK=/tmp/ssh-fwxMfj4987/agent.4987 


$ eval `ssh-agent -k` 
Agent pid 19330 killed 


$ set | grep SSH 
$ 


Intuitive, isn’t it? Not. Very slick, very efficient, very subtle, yes.
User-friendly, not so much.
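
The mechanism is easier to see without a real agent. The following is a minimal sketch in which fake_agent is a hypothetical stand-in that, like ssh-agent, only prints shell code; the DEMO_ variable names and values are ours for illustration and are not anything OpenSSH uses. The $( ) form is simply the newer spelling of the backticks shown above:

```bash
# fake_agent is a hypothetical stand-in for ssh-agent: it only *prints*
# shell code, so by itself it cannot change the calling shell's environment
fake_agent() {
    echo 'DEMO_AUTH_SOCK=/tmp/demo/agent.1234; export DEMO_AUTH_SOCK;'
    echo 'DEMO_AGENT_PID=1234; export DEMO_AGENT_PID;'
}

# Run it directly: the assignments are printed (discarded here), not executed
fake_agent > /dev/null
before=${DEMO_AGENT_PID:-unset}

# Run it through eval: the printed text is executed by the current shell
eval "$(fake_agent)"
after=${DEMO_AGENT_PID:-unset}

echo "before eval: $before, after eval: $after"
```

The same reasoning explains why ssh-agent -k needs eval as well: the unset commands it prints only take effect if the current shell executes them.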


Once we have the agent running as expected we have to load our identities 
using the ssh-add command. That’s very easy: we just run it, optionally with 
a list of key files to load. It will prompt for all the passphrases needed. In this 
example we don’t list any keys, so it just uses the default as set in the main 
SSH configuration file:


$ ssh-add

Enter passphrase for /home/jp/.ssh/id_rsa: 

Identity added: /home/jp/.ssh/id_rsa (/home/jp/.ssh/id_rsa) 
$ 


So now we can use SSH interactively, in this particular shell session, to log in 
to any machine we’ve previously configured, without a password or 
passphrase. So what about other sessions, scripts, or cron? 


Use Daniel Robbins’s keychain script, which: 


acts as a frontend to ssh-agent and ssh-add, but allows you to easily
have one long-running ssh-agent process per system, rather than the 
norm of one ssh-agent per login session. 


This dramatically reduces the number of times you need to enter your 
passphrase. With keychain, you only need to enter a passphrase once 
every time your local machine is rebooted. 


Keychain also makes it easy for remote cron jobs to securely “hook in” to 
a long-running ssh-agent process, allowing your scripts to take 
advantage of key-based logins. 


keychain is a clever, well-written, and well-commented shell script that 
automates and manages the otherwise tedious process of exporting those 
environment variables we discussed earlier into other sessions. It also makes 
them available to scripts and cron. But you’re probably saying to yourself, 
wait a second here, you want me to leave all my keys in this thing forever, 
until the machine reboots? Well, yes, but it’s not as bad as it sounds. 


First of all, you can always kill it, though that will also prevent scripts or cron 
from using it. Second, there is a --clear option that flushes cached keys
when you log in. Sound backward? It actually makes sense. Here are the 
details, from keychain’s author (first published by IBM developerWorks; see 
http://www.ibm.com/developerworks/linux/library/l-keyc2/): 


I explained that using unencrypted private keys is a dangerous practice, 
because it allows someone to steal your private key and use it to log in to 
your remote accounts from any other system without supplying a
password. Well, while keychain isn’t vulnerable to this kind of abuse (as
long as you use encrypted private keys, that is), there is a potentially
exploitable weakness directly related to the fact that keychain makes it so
easy to “hook in” to a long-running ssh-agent process. What would
happen, I thought, if some intruder were somehow able to figure out my
password or passphrase and log into my local system? If they were
somehow able to log in under my username, keychain would grant them 
instant access to my decrypted private keys, making it a no-brainer for 
them to access my other accounts. 


Now, before I continue, let’s put this security threat in perspective. If some 
malicious user were somehow able to log in as me, keychain would 
indeed allow them to access my remote accounts. Yet, even so, it would be 
very difficult for the intruder to steal my decrypted private keys since they 
are still encrypted on disk. Also, gaining access to my private keys would 
require a user to actually log in as me, not just read files in my directory. 
So, abusing ssh-agent would be a much more difficult task than simply 
stealing an unencrypted private key, which only requires that an intruder 
somehow gain access to my files in ~/.ssh, whether logged in as me or not. 
Nevertheless, if an intruder were successfully able to log in as me, they 
could do quite a bit of additional damage by using my decrypted private 
keys. So, if you happen to be using keychain on a server that you don’t log 
into very often or don’t actively monitor for security breaches, then 
consider using the --clear option to provide an additional layer of 
security. 


The --clear option allows you to tell keychain to assume that every new 
login to your account should be considered a potential security breach 
until proven otherwise. When you start keychain with the --clear option,
keychain immediately flushes all your private keys from ssh-agent’s 
cache when you log in, before performing its normal duties. Thus, if you’re 
an intruder, keychain will prompt you for passphrases rather than giving 
you access to your existing set of cached keys. However, even though this 
enhances security, it does make things a bit more inconvenient and very 
similar to running ssh-agent all by itself, without keychain. Here, as is 
often the case, one can opt for greater security or greater convenience, but
not both.


Despite this, using keychain with --clear still has advantages over using 
ssh-agent all by itself; remember, when you use keychain --clear, 
your cron jobs and scripts will still be able to establish passwordless 
connections; this is because your private keys are flushed at login, not 
logout. Since a logout from the system does not constitute a potential 
security breach, there’s no reason for keychain to respond by flushing 
ssh-agent’s keys. Thus, the --clear option [is] an ideal choice for
infrequently accessed servers that need to perform occasional secure 
copying tasks, such as backup servers, firewalls, and routers. 


keychain can also handle GPG keys. To actually use the keychain-wrapped
ssh-agent from a script or cron, simply source the file keychain creates from
your script:


[ -r ~/.ssh-agent ] && source ~/.ssh-agent \
    || { echo "keychain not running" >&2 ; exit 1; }


Discussion 


When using SSH in a script, you don’t want to be prompted to authenticate or 
have extraneous warnings displayed. The -q option will turn on quiet mode 
and suppress warnings, while -o 'BatchMode yes' will prevent user 
prompts. Obviously if there is no way for SSH to authenticate itself, it will 
fail, since it can’t even fall back to prompting for a password. But that 
shouldn’t be a problem since you’ve made it this far in this recipe. 
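
As a minimal sketch of how those two options might be bundled for use in a script, consider the helper below. The name ssh_batch and the host remote_host are hypothetical placeholders; only -q and -o 'BatchMode yes' come from the discussion above:

```bash
# ssh_batch is a hypothetical helper: run ssh quietly and never prompt.
# All arguments (host, remote command, extra options) are passed through.
ssh_batch() {
    ssh -q -o 'BatchMode yes' "$@"
}

# Usage (remote_host is a placeholder):
#   if ssh_batch remote_host 'uptime'; then
#       echo "remote command succeeded"
#   else
#       echo "ssh failed or could not authenticate" >&2
#   fi
```

Because BatchMode makes ssh fail instead of prompting, the script can test the exit status and report the problem rather than hanging while it waits for input.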


SSH is an amazing, wonderful tool and there is a lot to it—enough to fill 
another book about this size. We highly recommend SSH, The Secure Shell: 
The Definitive Guide, 2nd Edition, by Daniel J. Barrett, Richard Silverman, 
and Robert G. Byrnes (O’Reilly) for everything you ever wanted to know
(and more) about SSH. 


Using public keys between OpenSSH and SSH2 Server from SSH
Communications Security can be tricky; see Chapter 6 in Linux Security
Cookbook by the same authors (O’Reilly) for tips.


The IBM developerWorks articles on SSH by keychain author (and Gentoo
Chief Architect) Daniel Robbins are also a great reference
(http://www.ibm.com/developerworks/linux/library/l-keyc.html,
http://www.ibm.com/developerworks/linux/library/l-keyc2/,
http://www.ibm.com/developerworks/linux/library/l-keyc3/).


If keychain doesn’t seem to be working, or if it works for a while then seems 
to stop, you may have another script somewhere else rerunning ssh-agent and 
getting things out of sync. Check the following and make sure the PIDs and 
socket all agree: 


$ ps -ef | grep '[s]sh-agent'
jp       17364  0.0  0.0  3312  1132 ?  S  Dec16  0:00 ssh-agent


$ cat ~/.keychain/$HOSTNAME-sh
SSH_AUTH_SOCK=/tmp/ssh-UJc17363/agent.17363; export SSH_AUTH_SOCK;
SSH_AGENT_PID=17364; export SSH_AGENT_PID;


$ set | grep SSH_A 
SSH_AGENT_PID=17364 
SSH_AUTH_SOCK=/tmp/ssh-UJc17363/agent.17363 


NOTE 


Depending on your operating system, you may have to adjust your ps 
command; if -ef doesn’t work, try -eu. 
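
The comparison of the recorded settings against the live environment can also be scripted. This is a rough sketch only: agent_in_sync is a hypothetical function name, and it assumes the file holds nothing but shell assignments like the keychain file shown above. Pass a path that contains a slash, since the . command searches $PATH for bare filenames:

```bash
# agent_in_sync FILE: succeed only if the agent settings recorded in FILE
# match the ones in the current environment
agent_in_sync() {
    local file=$1 recorded
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }

    # Source the file in a subshell so it cannot alter our own environment
    recorded=$(
        unset SSH_AGENT_PID SSH_AUTH_SOCK
        . "$file" > /dev/null
        echo "$SSH_AGENT_PID $SSH_AUTH_SOCK"
    )
    [ "$recorded" = "$SSH_AGENT_PID $SSH_AUTH_SOCK" ]
}

# Usage:
#   agent_in_sync ~/.keychain/$HOSTNAME-sh \
#       || echo "agent settings are out of sync" >&2
```

This checks only that the two sets of values agree; confirming that a process with that PID is really running still takes the ps command above.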


KEY FINGERPRINTS 


All flavors of SSH support fingerprints to facilitate key comparison and verification 
for both user and host keys. As you may guess, bit-by-bit verification of long, 
seemingly random data is tedious and error prone at best, and virtually impossible at 
worst (say, over the phone). Fingerprints provide an easier way to perform this 
verification. You may have seen fingerprints in other applications, especially 
PGP/GPG keys. 


The reason to verify keys in the first place is to prevent so-called man in the middle 
attacks. If Alice sends her key to Bob, he must make sure that the key he receives is
actually from Alice, and that Eve has not intercepted it and sent her own key instead.
This requires an out-of-band communications channel, such as a telephone. 


There are two fingerprint formats, the traditional hex format from PGP and a newer, 
supposedly easier to read format called bubblebabble. When Bob receives Alice’s key, 
he calls her up and reads her the fingerprint. If they match, they both know he has the 
correct key. 


$ ssh-keygen -l -f ~/.ssh/id_rsa
4096 eb:b3:0b:3a:d8:9f:d0:02:5d:99:ce:69:98:ef:f0:0c This is my new key (RSA)

$ ssh-keygen -l -f ~/.ssh/id_rsa.pub
4096 eb:b3:0b:3a:d8:9f:d0:02:5d:99:ce:69:98:ef:f0:0c This is my new key (RSA)

$ ssh-keygen -B -f ~/.ssh/id_rsa
4096 xuked-dutis-hoper-berag-ducut-tycuc-salur-ruvin-kefeg-mobyg-nyxyx This is my new key (RSA)

$ ssh-keygen -B -f ~/.ssh/id_rsa.pub
4096 xuked-dutis-hoper-berag-ducut-tycuc-salur-ruvin-kefeg-mobyg-nyxyx This is my new key (RSA)


See Also 


• http://www.funtoo.org/Keychain

• https://www.ibm.com/developerworks/linux/library/l-keyc/index.html

• http://www.ibm.com/developerworks/linux/library/l-keyc2/

• https://www.ibm.com/developerworks/linux/library/l-keyc3/

• SSH, The Secure Shell: The Definitive Guide, 2nd Edition, by Daniel J.
Barrett, Richard Silverman, and Robert G. Byrnes (O’Reilly)

• Linux Security Cookbook by Daniel J. Barrett, Richard Silverman, and
Robert G. Byrnes (O’Reilly)

• Practical Cryptography by Niels Ferguson and Bruce Schneier (Wiley)

• Applied Cryptography by Bruce Schneier (Wiley)

• Recipe 8.15, “Doing More with less”


14.22 Restricting SSH Commands 


Problem 


You’d like to restrict what an incoming SSH user or script can do.²


Solution 


Edit the ~/.ssh/authorized_keys file, use SSH forced commands, and
optionally disable unnecessary SSH features. For example, suppose you want
to allow an rsync process without also allowing interactive use.


First, you need to figure out exactly what command is being run on the
remote side. Create a key (Recipe 14.21) and add a forced command to tell
you. Edit the ~/.ssh/authorized_keys file and add:


command="/bin/echo Command was: $SSH_ORIGINAL_COMMAND"


before the key. It will look something like this, all on one line:


command="/bin/echo Command was: $SSH_ORIGINAL_COMMAND" ssh-dss 

AAAAB3NzaC1kc3MAAAEBANpgvvTs Lst2m0ZIJAQayhhiMqa3aWwU3kfvOm9+myFZ9VeFSxM7 
IVxIjWFALQH3 jpLY+Q78FMZCTiG+ZrGZYn8adZ9yg5wACO3KXm2vKt8LfTx61+qkMR7v15N 
I7tZyhxGah5qHNehReFWLuk7IXCtRr zZRVWMdsHcL2SA1Y4fI9Y9FFVLBdE1Er+ZIucSxI lo 
6D1HF jKjt3wjbAal+oJxwZJaupZ0Q7N47uwMs LmcSELQBRNDsaogFRK Ler ZASPQ5P+AH/+C 
xa/fCGYwsogX$JJOH5S7+QIIJHFze35YZ1I+A1D3B1a4IBfikKvtoaFrSbMdhVAkChdAdMj096 
xhbdEAAAAVAISKzCEsrUo3KAvyU08KVD6e0B /NAAAA/3uUAx2TIB/M9MmPq jeH6 7MhSY5NaV 
WuMqwebDIXuvKQQDMUU4EP jRGmMS89H L8UKANOCq/C1T+0Gzn4zrbE06COSm3SRMP24HyIbE 
LhLWV49sfLROSQmh9FRL1sS7ZdcUrxkDkr 2J60n5cMVB9OM2nI LOOIHRVLd5RxPO1u81yqvhv 
E610RdA6IMjzXcQ8ebuD2R7330370GFD7e207DaabKKkHZIduL/zFbQkzMDK6uUAMP8y LRIN 
OfUsqIhHhtc/160T 2H6nMUO9MccxZTFUF GF 8xIOndELP6um4 jXYkK5Q301/CtU3TZyvNeWVw 
yGwDi4wg2 jeVeQYHU2RhHZcZpwAAAQEAV2086701U9s IuRi jp8s04h13eZrsE5rdn6auLl/mk 
m+xALO+WQeDXRONM9BWVSrNEmIJB74tEJL3qQTMEFoCON9KpOOYa7Qt8n4gZOvcZlI5u+cg 
ydimKaggS2SnoorsRLb2LhHpe6mXus8pUTF5QT8apgXM3TgFSLDT+3rCt40IdGCZLaP+UDB 
UNUSKf FwCr u6uGoXEwxaL O8NviwZ0c19qrcOYzp7i33m613a0Z9Pu+TPHGYC74QmBbWq8U9 
DAo+7yhRIhqfdJzk3vIKSLbCxg4PbMwx2Qfh4dLk+L7wOasKn15//W+RWBUrOLaZ1ZP1/az
sKONcygno/@Flew== This is my new key


Now execute your command and see what the result is:


$ ssh remote_host 'ls -l /etc' 
Command was: ls -l /etc 


$ 


Now, the problem with this approach is that it will break a program like rsync 
that depends on having the STDOUT/STDIN channel all to itself: 


$ rsync -avzL -e ssh remote_host:/etc . 

protocol version mismatch -- is your shell clean? 

(see the rsync manpage for an explanation) 

rsync error: protocol incompatibility (code 2) at compat.c(64) 


$ 


But we can work around that by modifying our forced command as follows:


command="/bin/echo Command was: $SSH_ORIGINAL_COMMAND >> ~/ssh_command"


On the client side we try again:


$ rsync -avzL -e ssh 192.168.99.56:/etc . 

rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(420)


$


And on the remote host side we now have:


$ cat ../ssh_command 
Command was: rsync --server --sender -vlLogDtprz . /etc 


$ 


So we can update our forced command as necessary. 
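
If more than one command has to be allowed, a common variation is to point command= at a small wrapper script that inspects $SSH_ORIGINAL_COMMAND itself. The sketch below illustrates that idea and is not part of the recipe as written: the function name command_allowed and the contents of the allowed list are hypothetical, and the list should be built from whatever the logging step above actually recorded:

```bash
# command_allowed CMD: succeed only if CMD exactly matches an allowed command.
# The entries below are illustrative, taken from the examples in this recipe.
command_allowed() {
    case "$1" in
        'rsync --server --sender -vlLogDtprz . /etc') return 0 ;;
        'ls -l /etc')                                 return 0 ;;
        *)                                            return 1 ;;
    esac
}

# In the wrapper named by command="...", the check might be used like this:
#   if command_allowed "$SSH_ORIGINAL_COMMAND"; then
#       exec $SSH_ORIGINAL_COMMAND    # unquoted on purpose: split into words
#   else
#       echo "Rejected: $SSH_ORIGINAL_COMMAND" >> ~/ssh_command
#       exit 1
#   fi
```

Matching the whole string exactly, rather than just its first word, keeps a client from appending extra commands after an allowed one.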


Two other things we can do are set a from host restriction and disable SSH
commands. The host restriction specifies the hostname or IP address of the
source host. Disabling commands is also pretty intuitive:


no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty


When we put it all together, it looks like this (still all on one giant line):


no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,from="local_client",command="rsync --server --sender -vlLogDtprz . /etc" ssh-dss
AAAAB3NzaC1kc3MAAAEBANpgvvTs Lst2mOZIJAQayhhiMqa3aWwU3kfvOm9+myFZ9vVeFSxM7 
IVxIjWFALQH3 jpLY+Q78FMZCT1iG+ZrGZYn8adZ9yg5wACO3KXm2vKt8LfTx61+qkMR7v15N 
I7tZyhxGah5qHNehReFWLuk7IXCtRr ZRVWMdsHcL2SA1Y4fI9Y9FFVLBdE1Er+ZIucSxI lo 
6D1HF jKjt3wjbAal+oJxwZJaupZ0Q7N47uwMs LmcSELQBRNDsaogFRK Ler ZASPQ5P+AH/+C 
xa/fCGYwsogX$JJQH5S7+QIIJHFze35YZ1I+A1D3B1a4IBfikKvtoaFr5SbMdhVAkChdAdMj096 
xhbdEAAAAVAISKzCEsrUo3KAvyU08KVD6e0B/NAAAA/3uUAx2TIB/M9MmPq jeH6 7MhS5SY5NaV 
WuMqwebDIXuvKQQDMUU4EP jRGmMS89H L8UKANOCq/C1T+0Gzn4zrbE06COSm3SRMP24HyIbE 
LhLWV49sfLROSQmh9FRL1s7ZdcUrxkDkr2J60n5cMVB9OM2nI LOOIHRVLd5RxPO1u81yqvhv 
E610RdA6IMjzXcQ8ebuD2R7330370GFD7e207DaabKKkHZIduL/zFbQkzMDK6UAMP8y LRIN 
OfUsqIhHhtc/160T 2H6nMUO9MccxZTFUF GF 8xIOndELP6um4 jXYkK5Q301/CtU3TZyvNeWVw 
yGwDi4wg2 jeVeQYHU2RhHZcZpwAAAQEAV2086701U9s IuRi jp8s04h13eZrsE5rdn6aul/mk 
m+xALO+WQeDXRONM9OBWVSrNEmIJB74tEJL3qQTMEFoCON9KpOOYa7Qt8n4gZOvcZlI5u+cg 
ydimKaggS2SnoorsRLb2LhHpe6mXus8pUTF5QT8apgXM3TgFSLDT+3rCt40IdGCZLaP+UDB 
UNUSKf FwCru6uGoXEwxaLO8Nvi1wZ0c19qrc0Yzp7i133m6i13a0Z9Pu+TPHGYC74QmBbWq8U9 
DAo+7yhRIhqfdJzk3vIKSLbCxg4PbMwx2Qfh4dLk+L7wOasKn15//W+RWBUrOLaZ1ZP1/az 
sKONcygno/OFiew== This is my new key 


Discussion 


If you have any problems with ssh, the -v option is very helpful. ssh -v or 
ssh -v -v will almost always give you at least a clue about what’s going 
wrong. Give them a try when things are working to get an idea of what their 
output looks like. 


If you’d like to be a little more open about what the key can and can’t do, 
look into the OpenSSH restricted shell, rssh, which supports scp, sftp, rdist, 
rsync, and cvs. 


You’d think restrictions like these would be easy, but it turns out they are not. 
The problem has to do with the way SSH (and the r-commands before it)
actually works. It’s a brilliant idea and it works very well, except that it’s
hard to limit. To vastly oversimplify it, you can think of SSH as connecting
your local STDOUT to STDIN on the remote side and the remote STDOUT
to your local STDIN, so all things like scp or rsync do is stream bytes from
the local machine to the remote machine as if over a pipe. But that very
flexibility precludes SSH from being able to restrict interactive access while
allowing scp. There’s no difference. And that’s why you can’t put lots of
echo and debugging statements in your bash configuration files (see Recipe
16.21); that output will intermingle with the byte stream and cause havoc.
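
One way to keep such output out of the byte stream is to print it only when the shell is interactive, which bash indicates by including the letter i in the special parameter $-. The following is a sketch under that assumption; the function name is_interactive and the greeting text are ours:

```bash
# is_interactive [FLAGS]: succeed if the shell option flags contain "i".
# With no argument it checks the current shell's own flags in $-.
is_interactive() {
    local flags=$-
    if [ $# -gt 0 ]; then
        flags=$1
    fi
    case "$flags" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# In an rc file, keep chatter away from scp, rsync, and friends:
if is_interactive; then
    echo "Welcome back, ${USER:-you}"
fi
```

When the rc file is read on behalf of scp or rsync the shell is not interactive, so nothing is printed and the stream stays clean.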


So how does rssh work? It provides a wrapper that you use instead of a 
default login shell (like bash) in /etc/passwd. That wrapper determines what
it will and will not allow, but with much more flexibility than a plain old 
SSH-restricted command. 


See Also 


• SSH, The Secure Shell: The Definitive Guide, 2nd Edition, by Daniel J.
Barrett, Richard Silverman, and Robert G. Byrnes (O’Reilly)

• Linux Security Cookbook by Daniel J. Barrett, Richard Silverman, and
Robert G. Byrnes (O’Reilly)

• Recipe 14.21, “Using SSH Without a Password”

• Recipe 16.21, “Creating Self-Contained, Portable rc Files”


14.23 Disconnecting Inactive Sessions 


Problem 


You’d like to be able to automatically log out inactive users, especially root. 


Solution 


Set the $TMOUT environment variable in /etc/bashrc or ~/.bashrc to the
number of seconds of inactivity before ending the session. In interactive
mode, once a prompt is issued, if the user does not enter a command in
$TMOUT seconds, bash will exit.


Discussion 


$TMOUT is also used in the read builtin and the select command in scripts.
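
As a small illustration of the read case, the sketch below sets TMOUT only inside a function so that a prompt cannot hang a script forever. The function name read_or_default and its arguments are hypothetical, and the timeout behavior is specific to bash:

```bash
# read_or_default DEFAULT SECONDS: read one line from standard input, but
# fall back to DEFAULT if nothing arrives within SECONDS (or at end of input).
# TMOUT is declared local, so only the read inside this function is affected.
read_or_default() {
    local default=$1 TMOUT=$2 answer
    if read -r answer; then
        printf '%s\n' "$answer"
    else
        printf '%s\n' "$default"
    fi
}

# Usage:
#   printf 'Continue? [y/n] '
#   reply=$(read_or_default n 10)
```

Note that this will not work if TMOUT has already been made read-only, as described next.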


Don’t forget to set this as a read-only variable in a system-level file such as 
/etc/profile or /etc/bashrc to which users have no write access if you don’t 
want them to be able to change it: 


declare -r TMOUT=3600 


# Or: 
readonly TMOUT=3600 


WARNING 
Since users have control over their own environments, you cannot totally rely 
on $TMOUT, even if you set it as read-only: the user could just run a different 
shell, or even a different instance of bash itself! Think of it as a helpful
reminder to cooperative users, especially knowledgeable and interrupt-driven 
system administrators who may get distracted (constantly). 


See Also 
• Recipe 16.21, “Creating Self-Contained, Portable rc Files”


1 We thank Daniel Barrett, Richard Silverman, and Robert Byrnes for their inspiration
and excellent work in SSH, The Secure Shell: The Definitive Guide—especially 
Chapters 2, 6, and 11—and Linux Security Cookbook, without which this recipe would 
be a mere shadow of itself. 

2 We thank Daniel Barrett, Richard Silverman, and Robert G. Byrnes for their inspiration 
and excellent work in SSH, The Secure Shell: The Definitive Guide (especially Chapters 
2, 6, and 11) and Linux Security Cookbook, without which this recipe would be a mere 
shadow of itself.


Chapter 15. Advanced Scripting


Unix and POSIX have long promised compatibility and portability, and long 
struggled to deliver it. Thus, one of the biggest problems for advanced 
scripters is writing scripts that are portable; i.e., that can work on any
machine that has bash installed. Writing scripts that run well on a wide 
variety of platforms is much more difficult than we wish it were. There are 
many variations from one system to another that can get in the way; for 
example, bash itself isn’t always installed in the same place, and many 
common Unix commands have slightly different options (or give slightly 
different output) depending on the operating system. In this chapter, we’ll
look at several of those problems and show you how to solve them with bash. 


Many of the other things that are periodically needed are not as simple as
we’d like them to be, either. So, we’ll also cover solutions for additional
advanced scripting tasks, such as automating processes using phases, sending
email from your script, logging to syslog, using your network resources, and
a few tricks for getting input and redirecting output. 


Although this chapter is about advanced scripting, we’d like to stress the need 
for clear code, written as simply as possible and documented. Brian 
Kernighan, one of the first Unix developers, put it well: 


Debugging is twice as hard as writing the code in the first place. 
Therefore, if you write the code as cleverly as possible, you are, by 
definition, not smart enough to debug it. 


It’s easy to write very clever shell scripts that are very difficult, if not 
impossible, to understand. The more clever you think you’re being now, as 
you solve the problem du jour, the more you’ll regret it 6, 12, or 18 months
from now when you (or worse yet, someone else) have to figure out what you 
did and why it broke. If you have to be clever, at least document how the 
script works! (See Recipe 5.1.)


15.1 Finding bash Portably for #!


Problem 


You need to run a bash script on several machines, but bash is not always in 
the same place (see Recipe 1.14). 


Solution 


Use the /usr/bin/env command in the shebang line, as in #!/usr/bin/env
bash. If your system doesn’t have env in /usr/bin, ask your system
administrator to install it, move it, or create a symbolic link because this is
the required location.


You could also create symbolic links for bash itself, but using env is the 
canonical and correct solution. 


Discussion 


env’s purpose is to “run a program in a modified environment,” but since it 
will search the path for the command it is given to run, it works very well for 
this use. 


You may be tempted to use #!/bin/sh instead. Don’t. If you are using
bash-specific features in your script, they will not work on machines that do
not use bash in Bourne shell mode for /bin/sh (e.g., BSD, Solaris, Ubuntu
6.10+). And even if you aren’t using bash-specific features now, you may
forget about that in the future. If you are committed to using only POSIX
features, by all means use #!/bin/sh (and don’t develop on Linux; see Recipe
15.3), but otherwise be specific.


You may sometimes see a space between #! and /bin/whatever. 
Historically there were some systems that required the space, though in 
practice we haven’t seen one in a long time. It’s very unlikely any system 
running bash will require the space, and leaving it out seems to be the most 
common usage now. But for the utmost historical compatibility, use the
space.


We have chosen to use #!/usr/bin/env bash in the longer scripts and
functions we’ve made available to download (see the end of the Preface for
details), because that will run unchanged on most systems. However, since
env uses the $PATH to find bash, this is arguably a security issue (see Recipe
14.2), albeit a minor one in our opinion.


WARNING 


Ironically, since we’re trying to use env for portability, shebang line processing 
is not consistent across systems. Many systems, including Linux, allow only a 
single argument to the interpreter. Thus, #!/usr/bin/env bash - will result 
in the error: 


/usr/bin/env: bash -: No such file or directory 


This is because the interpreter is /usr/bin/env and the single allowed argument is 
bash -. Other systems, such as BSD and Solaris, don’t have this restriction. 


Since the trailing - is a common security practice (see Recipe 14.2) and since 
this is supported on some systems but not others, this is a security and 
portability problem. 


You can use the trailing - for a tiny bit more security at the cost of portability, 
or omit it for portability at the cost of a tiny potential security risk. Since env is 
searching the path anyway, using it should probably be avoided if you have 
security concerns; thus, the inability to portably use the trailing - is tolerable. 


Therefore, our advice is to omit the - when using env for portability, and to 
hardcode the interpreter and trailing - when security is critical. 


See Also 


• For information on the shebang line (/usr/bin/env):
    - http://srfi.schemers.org/srfi-22/mail-archive/msg00069.html
    - http://www.in-ulm.de/~mascheck/various/shebang/
    - http://homepages.cwi.nl/~aeb/std/hashexclam-1.html
    - Section 3.16 of the Unix FAQs

• Recipe 1.14, “Getting bash for xBSD”

• Recipe 15.2, “Setting a POSIX $PATH”

• Recipe 15.3, “Developing Portable Shell Scripts”

• Recipe 15.6, “Using echo Portably”


15.2 Setting a POSIX $PATH 


Problem 


You are on a machine that provides older or proprietary tools (e.g., Solaris) 
and you need to set your path so that you get POSIX-compliant tools. 


Solution 
Use the getconf utility: 


PATH=$(PATH=/bin:/usr/bin getconf PATH)


Here are some default and POSIX paths on several systems:


# Red Hat Enterprise Linux (RHEL) 4.3
$ echo $PATH
/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/$USER/bin

$ getconf PATH
/bin:/usr/bin


# Debian Sarge
$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games

$ getconf PATH
/bin:/usr/bin


# Solaris 10
$ echo $PATH
/usr/bin:

$ getconf PATH
/usr/xpg4/bin:/usr/ccs/bin:/usr/bin:/opt/SUNWspro/bin


# OpenBSD 3.7
$ echo $PATH
/home/$USER/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin:/usr/games

$ getconf PATH
/usr/bin:/bin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/local/bin


Discussion 


getconf reports various system configuration variables, so you can use it to
set a default path. However, unless getconf itself is a builtin, you will need a
minimal path to find it, hence the PATH=/bin:/usr/bin part of the solution.


In theory, the variable you use should be CS_PATH. In practice, PATH worked
everywhere we tested while CS_PATH failed on the BSDs.
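
If a script has to cope with a system where getconf is missing or prints nothing, the lookup can be wrapped with a fallback. This is a sketch only: the function name posix_path is hypothetical, and the fallback value /bin:/usr/bin is our assumption rather than something the system reports:

```bash
# posix_path: print the POSIX path reported by getconf, or a minimal
# fallback (an assumption on our part) if getconf is missing or silent
posix_path() {
    local path
    path=$(PATH=/bin:/usr/bin getconf PATH 2>/dev/null)
    [ -n "$path" ] || path=/bin:/usr/bin
    printf '%s\n' "$path"
}

# Usage:
#   PATH=$(posix_path)
#   export PATH
```

Keeping the logic in a function lets the caller decide whether to replace $PATH outright or to prepend the result to the existing value.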


See Also 


• “Shell Corner: Processing Command-line Arguments with my_getopts” in
the January 2003 issue of Unix Review

• Recipe 9.11, “Finding a File Using a List of Possible Locations”

• Recipe 14.3, “Setting a Secure $PATH”

• Recipe 14.9, “Finding World-Writable Directories in Your $PATH”

• Recipe 14.10, “Adding the Current Directory to the $PATH”

• Recipe 16.4, “Changing Your $PATH Permanently”

• Recipe 16.5, “Changing Your $PATH Temporarily”

• Recipe 19.3, “Forgetting That the Current Directory Is Not in the $PATH”


15.3 Developing Portable Shell Scripts 


Problem 


You are writing a shell script that will need to run on multiple versions of 
multiple Unix or POSIX operating systems. 


Solution 


First, try using the command builtin with its -p option to find the POSIX 
version of program. For example, in /usr/xpg4 or /usr/xpg6 on Solaris: 


command -p program args 


Then, if possible, find the oldest or least capable Unix machine you have 
access to and develop the script on that platform. If you aren’t sure what the 
least capable platform is, use a BSD variant or Solaris (and the older a 
version you can find, the better). 


Discussion 


command -p uses a default path that is guaranteed to find all of the POSIX- 
standard utilities. If you’re sure your script will only ever run on Linux 
(famous last words), then don’t worry about it; otherwise, avoid developing 
cross-platform scripts on Linux or Windows (e.g., via Cygwin). 
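
A quick way to see the default path at work is to break $PATH on purpose inside a subshell. In this sketch the directory /nonexistent and the sample words are made up for illustration, and it assumes sort lives somewhere on the system’s default path:

```bash
# Even with a useless $PATH, command -p can still find standard utilities,
# because it searches a default path instead of the current one
sorted=$(
    PATH=/nonexistent
    printf '%s\n' pear apple fig | command -p sort
)
echo "$sorted"
```

A plain sort in place of command -p sort would fail with a “command not found” error here, because /nonexistent is the only directory searched. The three words come out in sorted order only because the default path is used instead.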


The problems with writing cross-platform shell scripts on Linux are: 


1. /bin/sh is not the Bourne shell; it’s really /bin/bash in POSIX mode, 
except when it’s /bin/dash (for example, Ubuntu 6.10+). Both are very 
good, but not perfect, and none of the three work exactly the same, 
which can be very confusing. In particular, the behavior of echo can
change.
2. Linux uses the GNU tools instead of the original Unix tools.


Don’t get us wrong, we love Linux and use it every day. But it isn’t really 
Unix: it does some things differently, and it has the GNU tools. The GNU 
tools are great, and that’s the problem. They have a lot of switches and 
features that aren’t present on other platforms, and your script will break in
odd ways no matter how careful you are about that. Conversely, Linux is so 
compatible with everything that scripts written for any other Unix-like 
systems will almost always run on it. They may not be perfect (e.g., echo’s 
default behavior is to display \n instead of printing a newline), but they’re 
often good enough. 


There is an ironic Catch-22 here—the more shell features you use, the less 
you have to depend on external programs that may or may not be there or 
work as expected. While bash is far more capable than sh, it’s also one of the 
tools that may or may not be there. Some form of sh will be on virtually any 
Unix or Unix-like system, but it isn’t always quite what you think it is. 


Another Catch-22 is that the GNU long options are much more readable in 
shell code, but are often not present on other systems. So instead of being 
able to say sort --field-separator=, unsorted_file > sorted_file, 
you have to use sort -t, unsorted_file > sorted_file for portability.
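
For instance, sorting comma-separated records by their second field needs only the short options. The sample data here is invented for illustration:

```bash
# Sort comma-separated records by their second field using only short options
by_city=$(printf '%s\n' 'smith,paris' 'jones,berlin' 'nguyen,oslo' |
    sort -t, -k2,2)
echo "$by_city"
```

A comment next to the command, saying what -t and -k do, recovers most of the readability that the long options would have provided.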


But take heart: developing on a non-Linux system is easier than it’s ever 
been. If you already have and use such systems, then this is obviously a 
nonissue. But if you don’t have such systems in-house, it’s now trivial to get 
them for free. Solaris and the BSDs all run in virtual environments (see 
Recipe 15.4). 


If you have a Mac running macOS (previously OS X), then you already have 
bash and BSD so you’re all set. You might want to make sure you have a 
recent version, though; see Recipe 1.15. 


You can also easily test scripts using a virtualization environment (see Recipe 
15.4). The flaw in this solution is the systems such as AIX and HP-UX that 
don’t run on an x86 architecture, and thus don’t run under x86 virtualization. 
Again, if you have these systems, use them. If not, see Recipe 1.18.


TIP


Debian and Ubuntu users should install the devscripts package (aptitude 
install devscripts), which provides a checkbashisms script to help find 
“bashisms” that will not work in dash. Users of other operating systems and/or 
Linux distributions should see if that is available for their system. 


See Also 


• help command

• http://en.wikipedia.org/wiki/Debian_Almquist_shell

• http://en.wikipedia.org/wiki/Bash

• http://partnerweb.vmware.com/GOSIG/Solaris_11.html

• http://www.polarhome.com/

• comp.sys.hp.hpux FAQ

• History of Unix

• Unix History repo

• Recipe 1.18, “Getting bash Without Getting bash”

• Recipe 15.4, “Testing Scripts Using Virtual Machines”

• Recipe 15.6, “Using echo Portably”

• “echo Options and Escape Sequences” in Appendix A


15.4 Testing Scripts Using Virtual Machines 


Problem 


You need to develop cross-platform scripts but do not have the appropriate
systems or hardware.


Solution


If the target platforms run on the x86 architecture, use one of the many free
and commercial virtualization solutions and build your own test virtual
machine (VM), or search for prebuilt virtual machines on the OS vendor or
distributor’s site, or the internet. Or use a free (for a trial period) or
low-cost VM from a cloud vendor.


The flaw in this solution is that systems such as AIX and HP-UX don’t run on
an x86 architecture, and thus don’t run under x86 virtualization.
Again, if you have these systems, use them. If not, see Recipe 1.18.


Discussion 


Testing shell scripts is usually not very resource-intensive, so even moderate
hardware capable of running VirtualBox or a similar virtualization package
should be fine. We mention VirtualBox specifically because it’s without cost, 
runs on Linux, macOS, and Windows, is used in countless examples around 
the web and tools such as Vagrant, and is flexible and easy to use; but there 
are certainly other alternatives available. 


Minimal virtual machines with 128 MB of RAM, or sometimes even less, 
should be more than enough for a shell environment for testing. Set up an 
NFS share to store your test scripts and data, and then simply SSH to the test 
system. Debian is a good place to start if you are building your own; just 
remember to uncheck everything you can during the install. 


There are a great many pre-built VMs available on the internet, but quality 
and security will vary. If you are testing at work, be sure to check your 
corporate policies; many companies prohibit bringing “random internet 
downloads” into the corporate network. On the other hand, your company 
may build or provide its own VM images for internal use. You will probably 
want only a very minimal VM for testing shell scripts, but the definition of 
“minimal” will also vary greatly among different sources. You’ll need to do a 
little research to find a fit for your needs. Some good places to start are: 


• TurnKey Linux
• VMware
• OSBoxes
• KVM
• Parallels


Depending on your needs and corporate policy, you may also be able to get a
free or low-cost VM in the cloud. See Recipe 1.18 for details about getting an
almost free shell account from http://polarhome.com, which has a tiny,
symbolic one-time fee, or another vendor.


Amazon has a “free tier” offering that may be useful, and it and many other 
vendors like Linode and Digital Ocean have very inexpensive pay-as-you-go 
options. 


Don’t forget about just booting a LiveCD/LiveDVD either, as we mentioned 
in Recipe 1.18. 


Finally, if all that is not enough, the initiator of the QEMU emulator, Fabrice 
Bellard, has written a PC emulator in JavaScript that lets you boot VM 
images with just a web browser! 


No matter what option you choose, there will be a lot more information, 
documentation, and how-to guides available on the internet than we can fit in 
this recipe. Our main goal here is just to get you thinking about some 
possibilities. 


WARNING 


Be sure to check your corporate policies before doing anything in this recipe! 


See Also 


• https://www.virtualbox.org/
• https://www.debian.org/distrib/netinst
• https://www.turnkeylinux.org/core (bash 4.3 or newer)
• http://www.vmware.com/ (commercial)
    ◦ https://solutionexchange.vmware.com/store/category_groups/virtual-appliances
• https://www.osboxes.org/
• http://www.linux-kvm.org
    ◦ http://xmodulo.com/use-kvm-command-line-debian-ubuntu.html
    ◦ http://www.thegeekstuff.com/2014/10/linux-kvm-create-guest-vm/
• http://polarhome.com
• https://shells.red-pill.eu/
• https://aws.amazon.com/free/
• https://www.linode.com/pricing
• https://www.digitalocean.com/pricing/
• http://wiki.qemu.org
• http://copy.sh/v86/
• Recipe 1.14, “Getting bash for xBSD”
• Recipe 1.18, “Getting bash Without Getting bash”


15.5 Using for Loops Portably 


Problem 


You need to do a for loop but want it to work on older versions of bash. 


Solution


This method is portable back to bash 2.04+:


$ for ((i=0; i<10; i++)); do echo $i; done
0
1
2
3
4
5
6
7
8
9


Discussion 


There are nicer ways of writing this loop in newer versions of bash, but they 
are not backward compatible. As of bash 3.0+ you can use the syntax for 
{x..y}, as in: 


$ for i in {1..10}; do echo $i; done 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10


If your system has the seq command, you could also do this:


$ for i in $(seq 1 10); do echo $i; done 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10
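

If the script might ever run under a plain POSIX sh rather than bash, none of
the forms above can be relied on: the C-style for loop and brace expansion are
not POSIX features, and seq is not a POSIX utility. A while loop with POSIX
arithmetic is a reasonable fallback. This is a minimal sketch, and the function
name count_to is ours:

```shell
# Hypothetical helper: print the integers from 1 to $1, one per line,
# using only POSIX sh constructs.
count_to() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo "$i"
        i=$((i + 1))
    done
}

count_to 10
```

Note that i is a global variable here, since local is not part of POSIX sh
either.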


See Also


• help for
• man seq
• Recipe 6.12, “Looping with a Count”
• Recipe 6.13, “Looping with Floating-Point Values”
• Recipe 17.24, “Writing Sequences”


15.6 Using echo Portably 


Problem 


You are writing a script that will run on multiple versions of Unix and Linux 
and you need echo to behave consistently even if it is not running on bash. 


Solution 


Use printf "%b" whatever, or test for the system and set xpg_echo using 
shopt -s xpg_echo as needed. 


If you omit the "%b" format string (for example, printf whatever), then 
printf will try to interpret any % characters in whatever, which is probably 
not what you want. The "%b" format is an addition to the standard printf 
format that will prevent that misinterpretation and also expand backslash 
escape sequences in whatever. 


Setting xpg_echo is less consistent since it only works on bash. It can be
effective if you are sure that you’ll only ever run under bash, and not under
sh or another similar shell that doesn’t use xpg_echo.


Using printf requires changes to how you write echo statements, but it’s
defined by POSIX and should be consistent across any POSIX shell
anywhere. Specifically, you have to write printf "%b" instead of just echo.


WARNING


If you automatically type $b instead of %b you will be unhappy because that 
will print a blank line, since you have specified a null format—that is, unless $b 
is actually defined, in which case the results depend on the value of $b. Either 
way, this can be a very difficult bug to find since $b and %b look very similar: 


$ printf "%b" "Works" 
Works 

$ printf "$b" "Broken" 
$ 
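

One way to avoid retyping (and mistyping) the format string is to hide printf
behind a pair of small functions and call those instead of echo. This is only
a sketch; the names say and say_e are ours, not standard commands:

```shell
# Hypothetical helper: print the arguments literally, followed by a newline
say() {
    printf '%s\n' "$*"
}

# Hypothetical helper: print the arguments with backslash escapes expanded,
# followed by a newline
say_e() {
    printf '%b\n' "$*"
}

say   'one\ttwo'    # backslash sequence is left alone
say_e 'one\ttwo'    # backslash sequence becomes a real tab
```

Because the format string is written once, inside single quotes, there is no
opportunity for a stray $b to creep in at each call site.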


Discussion 


In some shells, the builtin echo behaves differently than the external echo 
used on other systems. This is not always obvious when running on Linux 
since /bin/sh is actually bash (usually; it could also be dash on Ubuntu 
6.10+), and there are similar circumstances on some BSDs. The difference is 
in how echo does or does not expand backslash-escape sequences. Shell 
builtin versions tend not to expand, while external versions (e.g., /bin/echo 
and /usr/bin/echo) tend to expand; but again, that can change from system to 
system. 


Typical Linux (/bin/bash): 


$ type -a echo 
echo is a shell builtin 
echo is /bin/echo 


$ builtin echo "one\ttwo\nthree" 
one\ttwo\nthree\n 


$ /bin/echo "one\ttwo\nthree" 
one\ttwo\nthree\n 


$ echo -e "one\ttwo\nthree"
one     two
three


$ /bin/echo -e "one\ttwo\nthree"
one     two
three


$ shopt -s xpg_echo


$ builtin echo "one\ttwo\nthree"
one     two
three


$ shopt -u xpg_echo


$ builtin echo "one\ttwo\nthree"
one\ttwo\nthree\n


Typical BSD (/bin/csh, then /bin/sh): 


$ which echo
echo: shell builtin command.


$ echo "one\ttwo\nthree"
one\ttwo\nthree\n


$ /bin/echo "one\ttwo\nthree"
one\ttwo\nthree\n


$ echo -e "one\ttwo\nthree"
-e one\ttwo\nthree\n


$ /bin/echo -e "one\ttwo\nthree"
-e one\ttwo\nthree\n


$ printf "%b" "one\ttwo\nthree"
one     two
three


$ /bin/sh


$ echo "one\ttwo\nthree"
one\ttwo\nthree\n


$ echo -e "one\ttwo\nthree"
one     two
three


$ printf "%b" "one\ttwo\nthree"
one     two
three


Solaris 10 (/bin/sh): 


$ which echo 
/usr/bin/echo 


$ type echo 
echo is a shell builtin 


$ echo "one\ttwo\nthree"
one     two
three


$ echo -e "one\ttwo\nthree"
-e one  two
three


$ printf "%b" "one\ttwo\nthree"
one     two
three


See Also 
• help printf
• man 1 printf
• http://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html
• http://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
• Recipe 2.3, “Writing Output with More Formatting Control”
• Recipe 2.4, “Writing Output Without the Newline”
• Recipe 15.1, “Finding bash Portably for #!”
• Recipe 15.3, “Developing Portable Shell Scripts”
• Recipe 19.11, “Seeing Odd Behavior from printf”
• “printf” in Appendix A


15.7 Splitting Output Only When Necessary 


Problem 


You want to split output only if the input exceeds your limit, but the split 
command always creates at least one new file. 


Solution 


Example 15-1 illustrates a way to break up input into fixed sizes only if the 
input exceeds the size limit. 


Example 15-1. ch15/func_split


# cookbook filename: func_split
##########################################################################
# Output fixed-size pieces of input ONLY if the limit is exceeded
# Called like: Split <file> <prefix> <limit option> <limit argument>
# e.g. Split $output ${output} --lines 100
# See split(1) and wc(1) for option details
function Split {
    local file=$1
    local prefix=$2
    local limit_type=$3
    local limit_size=$4
    local wc_option

    # Sanity checks
    if [ -z "$file" ]; then
        printf "%b" "Split: requires a file name!\n"
        return 1
    fi
    if [ -z "$prefix" ]; then
        printf "%b" "Split: requires an output file prefix!\n"
        return 1
    fi
    if [ -z "$limit_type" ]; then
        printf "%b" \
          "Split: requires a limit option (e.g. --lines), see 'man split'!\n"
        return 1
    fi
    if [ -z "$limit_size" ]; then
        printf "%b" "Split: requires a limit size (e.g. 100), see 'man split'!\n"
        return 1
    fi

    # Convert split options to wc options. Sigh.
    # Not all options supported by all wc/splits on all systems
    case $limit_type in
        -b|--bytes)      wc_option='-c';;
        -C|--line-bytes) wc_option='-L';;
        -l|--lines)      wc_option='-l';;
    esac

    # If whatever limit is exceeded
    if [ "$(wc $wc_option $file | awk '{print $1}')" -gt $limit_size ]; then
        # Actually do something
        split --verbose $limit_type $limit_size $file $prefix
    fi
} # end of function Split


Discussion 


Depending on your system, some options (e.g., -C) may not be available in 
split or wc. 
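

If GNU-only options such as --verbose are a concern, the same idea can be
reduced to the line-count case using only options that POSIX specifies for wc
and split. The following is a sketch under that assumption, not a replacement
for Example 15-1; the name split_if_longer is ours, and the demo assumes
mktemp is available:

```shell
# Hypothetical helper: split FILE into pieces named PREFIXaa, PREFIXab, ...
# of at most LIMIT lines each, but only when FILE has more than LIMIT lines.
split_if_longer() {
    file=$1
    prefix=$2
    limit=$3
    # Reading from STDIN keeps the filename out of wc's output; the
    # arithmetic expansion strips any padding some wc versions add.
    lines=$(( $(wc -l < "$file") ))
    if [ "$lines" -gt "$limit" ]; then
        split -l "$limit" "$file" "$prefix"
    fi
}

# Demo in a scratch directory
tmpdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 > "$tmpdir/input"
split_if_longer "$tmpdir/input" "$tmpdir/big." 10    # 5 <= 10: nothing happens
split_if_longer "$tmpdir/input" "$tmpdir/small." 2   # 5 > 2: three pieces
ls "$tmpdir"
rm -rf "$tmpdir"
```

The variables are global here because local is not available in every sh; in a
bash-only script you would declare them local as Example 15-1 does.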


See Also 


• Recipe 8.13, “Counting Lines, Words, or Characters in a File”


15.8 Viewing Output in Hex


Problem 


You need to see output in hex mode to verify that a certain whitespace or 
unprintable character is as expected. 


Solution 


Pipe the output through hexdump using the -C option for canonical output:


$ hexdump -C filename
00000000  4c 69 6e 65 20 31 0a 4c  69 6e 65 20 32 0a 0a 4c  |Line 1.Line 2..L|
00000010  69 6e 65 20 34 0a 4c 69  6e 65 20 35 0a 0a        |ine 4.Line 5..|
0000001e
$


For example, nl uses spaces (ASCII 20), then the line number, then a tab 
(ASCII 09) in its output: 


$ nl -ba filename | hexdump -C
00000000  20 20 20 20 20 31 09 4c  69 6e 65 20 31 0a 20 20  |     1.Line 1.  |
00000010  20 20 20 32 09 4c 69 6e  65 20 32 0a 20 20 20 20  |   2.Line 2.    |
00000020  20 33 09 0a 20 20 20 20  20 34 09 4c 69 6e 65 20  | 3..     4.Line |
00000030  34 0a 20 20 20 20 20 35  09 4c 69 6e 65 20 35 0a  |4.     5.Line 5.|
00000040  20 20 20 20 20 36 09 0a                           |     6..|
00000048
$


Discussion 


hexdump is a BSD utility that also comes with many Linux distributions.
Other systems, notably Solaris, do not have it by default. You can use the
octal dump command od, but its output is only one format at a time, and its
addresses (lefthand column) are in octal, not hex:


$ nl -ba filename | od -x
0000000 2020 2020 3120 4c09 6e69 2065 0a31 2020
0000020 2020 3220 4c09 6e69 2065 0a32 2020 2020
0000040 3320 0a09 2020 2020 3420 4c09 6e69 2065
0000060 0a34 2020 2020 3520 4c09 6e69 2065 0a35
0000100 2020 2020 3620 0a09
0000110

$ nl -ba filename | od -tx1
0000000 20 20 20 20 20 31 09 4c 69 6e 65 20 31 0a 20 20
0000020 20 20 20 32 09 4c 69 6e 65 20 32 0a 20 20 20 20
0000040 20 33 09 0a 20 20 20 20 20 34 09 4c 69 6e 65 20
0000060 34 0a 20 20 20 20 20 35 09 4c 69 6e 65 20 35 0a
0000100 20 20 20 20 20 36 09 0a
0000110

$ nl -ba filename | od -c
0000000                       1  \t   L   i   n   e       1  \n
0000020               2  \t   L   i   n   e       2  \n
0000040       3  \t  \n                       4  \t   L   i   n   e
0000060   4  \n                       5  \t   L   i   n   e       5  \n
0000100                       6  \t  \n
0000110
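

For quick checks inside a script, od can also be asked to drop the address
column (-A n) and print one byte per field (-t x1); both options are specified
by POSIX. The sketch below wraps that in a function. The name hexbytes is
ours, and the unquoted command substitution is deliberate so that od's padding
and line breaks collapse to single spaces:

```shell
# Hypothetical helper: print STDIN as space-separated hex bytes on one line
hexbytes() {
    # -v stops od from abbreviating repeated lines as "*".
    # Unquoted on purpose: word splitting discards padding and newlines.
    set -- $(od -A n -t x1 -v)
    printf '%s\n' "$*"
}

printf 'A\tB\n' | hexbytes
```

Here the tab shows up as 09 and the trailing newline as 0a, which is usually
all you need to confirm which whitespace character you are looking at.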


There is also a simple Perl script that might work:


$ ./hexdump.pl filename

        /0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /A /B /C /D /E /F    0123456789ABCDEF
0000 :  4C 69 6E 65 20 31 0A 4C 69 6E 65 20 32 0A 0A 4C    Line 1.Line 2..L
0010 :  69 6E 65 20 34 0A 4C 69 6E 65 20 35 0A 0A          ine 4.Line 5..


See Also


• man hexdump
• man od
• http://www.khngai.com/perl/bin/hexdump.txt
• http://gnuwin32.sourceforge.net/packages/hextools.htm
• “Table of ASCII Values” in Appendix A


15.9 Using bash Net-Redirection 


Problem 


You need to send or receive very simple network traffic but you do not have 
a tool such as Netcat installed. 


Solution 


If you have bash version 2.04+ compiled with --enable-net-redirections
(default), you can use bash itself. The following example is also used in
Recipe 15.10:


$ exec 3<> /dev/tcp/checkip.dyndns.org/80
$ echo -e "GET / HTTP/1.0\n" >&3
$ cat <&3
HTTP/1.1 200 OK
Content-Type: text/html
Server: DynDNS-CheckIP/1.0
Connection: close
Cache-Control: no-cache
Pragma: no-cache
Content-Length: 105

<html><head><title>Current IP Check</title></head>
<body>Current IP Address: 72.NN.NN.225</body></html>

$ exec 3<> /dev/tcp/checkip.dyndns.org/80
$ echo -e "GET / HTTP/1.0\n" >&3
$ egrep --only-matching 'Current IP Address: [0-9.]+' <&3
Current IP Address: 72.NN.NN.225
$


WARNING


Debian and derivatives such as Ubuntu expressly compiled with
--disable-net-redirections until bash version 4, so this recipe will not work
on those versions.


Discussion 


As noted in Recipe 15.12, it is possible to use exec to permanently redirect 
file handles within the current shell session, so the first command sets up 
input and output on file handle 3. The second line sends a trivial command to 
a path on the web server defined in the first command. Note that the user 
agent will appear as “-” on the web server side, which is what is causing the 
“flagged User Agent” warning. The third command simply displays the 
results. 


Both TCP and UDP are supported. Here is a trivial way to send syslog
messages to a remote server (although in production we recommend using the
logger utility, which is much more user-friendly and robust):


echo "<133>${0##*/}[$$]: Test syslog message from bash" \ 
> /dev/udp/loghost.example.com/514 


Since UDP is connectionless, this is actually much easier to use than the
previous TCP example. <133> is the syslog priority value for local0.notice,
calculated according to RFC 3164. See section 4.1.1 of the RFC and the
logger manpage for details. $0 is the name of the current program, so
${0##*/} is the “basename,” and $$ is the process ID of the current program.
The name will be -bash for a login shell.
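

The priority value is the facility number multiplied by 8, plus the severity
number. In RFC 3164, local0 is facility 16 and notice is severity 5, which is
where 133 comes from. A quick sketch of the arithmetic (the function name
syslog_pri is ours):

```shell
# Hypothetical helper: compute a syslog priority from numeric facility ($1)
# and severity ($2), per the formula in RFC 3164 section 4.1.1
syslog_pri() {
    echo $(( $1 * 8 + $2 ))
}

syslog_pri 16 5     # local0.notice -> 133
syslog_pri 1 6      # user.info     -> 14
```

In a real script you would more likely let logger do this translation for you
from the symbolic facility.level name.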


See Also


• man logger
• RFC 3164
• Recipe 15.10, “Finding My IP Address”
• Recipe 15.12, “Redirecting Output for the Life of a Script”
• Recipe 15.14, “Logging to syslog from Your Script”
• The bash documentation
• https://bugs.launchpad.net/ubuntu/+source/bash/+bug/215034


15.10 Finding My IP Address 


Problem 


You need to know the IP address of the machine you are running on. 


Solution 


There is no good way to do this that will work on all systems in all situations, 
so we will present several possible solutions. 


First, you can parse output from ifconfig to look for IP addresses. The 
commands in Example 15-2 will either return the first IP address that is not a 
loopback or nothing if there are no interfaces configured or up. 


Example 15-2. ch15/finding_ipas


# cookbook filename: finding_ipas

# IPv4 Using awk, cut, and head
$ /sbin/ifconfig -a | awk '/(cast)/ { print $2 }' | cut -d':' -f2 | head -1

# IPv4 Using Perl, just for fun
$ /sbin/ifconfig -a | perl -ne 'if ( m/^\s*inet (?:addr:)?([\d.]+).*?cast/ )
> { print qq($1\n); exit 0; }'

# IPv6 Using awk, cut, and head
$ /sbin/ifconfig -a | egrep 'inet6 addr: |address: ' | cut -d':' -f2- \
    | cut -d'/' -f1 | head -1 | tr -d ' '

# IPv6 Using Perl, just for fun
$ /sbin/ifconfig -a | perl -ne 'if
> ( m/^\s*(?:inet6)? \s*addr(?:ess)?: ([0-9A-Fa-f:]+)/ ) { print qq($1\n);
> exit 0; }'


Second, you can get your hostname and resolve it back to an IP address. This 
is often unreliable because today’s systems (especially workstations) might 
have incomplete or incorrect hostnames and/or might be on a dynamic 
network that lacks proper reverse lookup. Use at your own risk and test well: 


host $(hostname) 


Third, you may be more interested in your host’s external, routable address 
than its internal RFC 1918 address. In that case you can use an external host 
such as http://whatismyip.akamai.com, http://checkip.amazonaws.com/, 
http://ipinfo.io/, or others to learn the address of your firewall or NAT device. 
The catch here is that non-Linux systems often have no command-line tool 
like wget installed by default. lynx or curl will also work, but they aren’t 
usually installed by default either (although macOS 10.4+ has curl). Note the 
IP address and other information is deliberately obscured in the following 
examples: 


$ wget -qO - http://ipinfo.io/
{
  "ip": "8.8.8.8",
  "hostname": "google-public-dns-a.google.com",
  "city": "Mountain View",
  "region": "California",
  "country": "US",
  "loc": "37.3860,-122.0840",
  "org": "AS15169 Google Inc.",
  "postal": "94035",
  "phone": "650"
}


$ wget -qO - http://ipinfo.io/ip/
72.NN.NN.225


$ lynx -dump http://ipinfo.io/ip/
72.NN.NN.225


$ curl whatismyip.akamai.com
72.NN.NN.225


$ curl http://checkip.amazonaws.com
72.NN.NN.225


If you do not have any of the programs used here, but you do have bash 
version 2.04+ compiled with --enable-net-redirections (it isn’t 
compiled this way prior to bash 4 in Debian and derivatives), you can use 
bash itself (see Recipe 15.9 for details). 


$ exec 3<> /dev/tcp/checkip.dyndns.org/80
$ echo -e "GET / HTTP/1.0\n" >&3
$ cat <&3
HTTP/1.1 200 OK
Content-Type: text/html
Server: DynDNS-CheckIP/1.0
Connection: close
Cache-Control: no-cache
Pragma: no-cache
Content-Length: 105

<html><head><title>Current IP Check</title></head>
<body>Current IP Address: 72.NN.NN.225</body></html>

$ exec 3<> /dev/tcp/checkip.dyndns.org/80
$ echo -e "GET / HTTP/1.0\n" >&3
$ egrep --only-matching 'Current IP Address: [0-9.]+' <&3
Current IP Address: 72.NN.NN.225
$


Discussion


The awk and Perl code in the first solution is interesting because of the
operating system variations we will note here. But it turns out that the lines
we’re interested in all contain either Bcast or broadcast (or inet6 addr: or
address:), so once we get those lines it’s just a matter of parsing out the
field we want. Of course Linux makes that harder by using a different format,
but we’ve dealt with that too.
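

To see how the IPv4 pipeline copes with both layouts without needing a
particular machine, you can feed it one line of each style by hand. In this
sketch the function name first_ipv4 is ours, and the two sample lines mimic the
Linux and BSD formats that ifconfig produces:

```shell
# Hypothetical helper: read ifconfig-style text on STDIN and print the first
# IPv4 address found on a line mentioning Bcast or broadcast.
first_ipv4() {
    # head -n 1 is the POSIX spelling of head -1
    awk '/(cast)/ { print $2 }' | cut -d':' -f2 | head -n 1
}

# Linux-style line: the address is glued to "addr:", so cut strips the label
echo '    inet addr:192.168.99.11  Bcast:192.168.99.255  Mask:255.255.255.0' \
    | first_ipv4

# BSD-style line: no colon in the field, so cut passes it through untouched
echo '    inet 192.168.99.56 netmask 0xffffff00 broadcast 192.168.99.255' \
    | first_ipv4
```

The lowercase pattern matters: it matches Bcast and broadcast but not the
uppercase BROADCAST that appears in the interface flags.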


Not all systems require the path (if you aren’t root) or the -a argument to
ifconfig, but all accept it, so it’s best to use /sbin/ifconfig -a and be done
with it.


Here are some ifconfig output examples from different machines: 


# Linux
$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:C0:9F:0B:8F:F6
          inet addr:192.168.99.11  Bcast:192.168.99.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:33073511 errors:0 dropped:0 overruns:0 frame:827
          TX packets:52865023 errors:0 dropped:0 overruns:1 carrier:7
          collisions:12922745 txqueuelen:100
          RX bytes:2224430163 (2121.3 Mb)  TX bytes:51266497 (48.8 Mb)
          Interrupt:11 Base address:0xd000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:659102 errors:0 dropped:0 overruns:0 frame:0
          TX packets:659102 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:89603190 (85.4 Mb)  TX bytes:89603190 (85.4 Mb)


$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:06:29:33:4D:42
          inet addr:192.168.99.144  Bcast:192.168.99.255  Mask:255.255.255.0
          inet6 addr: fe80::206:29ff:fe33:4d42/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1246774 errors:14 dropped:0 overruns:0 frame:14
          TX packets:1063160 errors:0 dropped:0 overruns:0 carrier:5
          collisions:65476 txqueuelen:1000
          RX bytes:731714472 (697.8 MiB)  TX bytes:942695735 (899.0 MiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:144664 errors:0 dropped:0 overruns:0 frame:0
          TX packets:144664 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:152181602 (145.1 MiB)  TX bytes:152181602 (145.1 MiB)

sit0      Link encap:IPv6-in-IPv4
          inet6 addr: ::127.0.0.1/96 Scope:Unknown
          UP RUNNING NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:101910 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)


# NetBSD
$ /sbin/ifconfig -a
pcn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:31:eb:19
        media: Ethernet autoselect (autoselect)
        inet 192.168.99.56 netmask 0xffffff00 broadcast 192.168.99.255
        inet6 fe80::20c:29ff:fe31:eb19%pcn0 prefixlen 64 scopeid 0x1
lo0: flags=8009<UP,LOOPBACK,MULTICAST> mtu 33196
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
ppp0: flags=8010<POINTOPOINT,MULTICAST> mtu 1500
ppp1: flags=8010<POINTOPOINT,MULTICAST> mtu 1500
sl0: flags=c010<POINTOPOINT,LINK2,MULTICAST> mtu 296
sl1: flags=c010<POINTOPOINT,LINK2,MULTICAST> mtu 296
strip0: flags=0 mtu 1100
strip1: flags=0 mtu 1100


# OpenBSD, FreeBSD
$ /sbin/ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33224
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
le1: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:25:df:00
        inet6 fe80::20c:29ff:fe25:df00%le1 prefixlen 64 scopeid 0x1
        inet 192.168.99.193 netmask 0xffffff00 broadcast 192.168.99.255
pflog0: flags=0<> mtu 33224
pfsync0: flags=0<> mtu 2020


# Solaris
$ /sbin/ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
pcn0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
        inet 192.168.99.159 netmask ffffff00 broadcast 192.168.99.255


# Mac
$ /sbin/ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet6 fe80::20d:93ff:fe65:f720%en0 prefixlen 64 scopeid 0x4
        inet 192.168.99.155 netmask 0xffffff00 broadcast 192.168.99.255
        ether 00:0d:93:65:f7:20
        media: autoselect (100baseTX <half-duplex>) status: active
        supported media: none autoselect 10baseT/UTP <half-duplex>
         10baseT/UTP <full-duplex> 10baseT/UTP <full-duplex,hw-loopback>
         100baseTX <half-duplex> 100baseTX <full-duplex>
         100baseTX <full-duplex,hw-loopback>
fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 2030
        lladdr 00:0d:93:ff:fe:65:f7:20
        media: autoselect <full-duplex> status: inactive
        supported media: autoselect <full-duplex>


See Also


• man awk
• man curl
• man cut
• man head
• man lynx
• man perl
• man wget
• http://checkip.amazonaws.com
• http://checkip.dyndns.org
• http://whatismyip.akamai.com
• http://ipinfo.io
• http://xmodulo.com/how-to-find-the-public-ip-address-from-command-line.html
• http://www.faqs.org/rfcs/rfc1918.html
• Recipe 15.9, “Using bash Net-Redirection”
• Recipe 15.12, “Redirecting Output for the Life of a Script”


15.11 Getting Input from Another Machine 


Problem 


Your script needs to get input from another machine, perhaps to check if a 
file exists or a process is running. 


Solution 


Use SSH with public keys and command substitution. To do this, set up SSH
so that you do not need a password, as described in Recipe 14.21. Next, tailor
the command that SSH runs to output exactly what your script needs as input.
Then simply use command substitution (see Example 15-3).


Example 15-3. ch15/command_substitution


#!/usr/bin/env bash
# cookbook filename: command_substitution

REMOTE_HOST='host.example.com'  # Required
REMOTE_FILE='/etc/passwd'       # Required
SSH_USER='user@'                # Optional, set to '' to not use
#SSH_ID='-i ~/.ssh/foo.id'      # Optional, set to '' to not use
SSH_ID=''

result=$(
    ssh $SSH_ID $SSH_USER$REMOTE_HOST \
      "[ -r $REMOTE_FILE ] && echo 1 || echo 0"
) || { echo "SSH command failed!" >&2; exit 1; }

if [ $result = 1 ]; then
    echo "$REMOTE_FILE present on $REMOTE_HOST"
else
    echo "$REMOTE_FILE not present on $REMOTE_HOST"
fi


Discussion


We do a few interesting things here. First, notice how both $SSH_USER and 
$SSH_ID work. They have an effect when they have a value, but when they 
are empty they interpolate to the empty set and are ignored. This allows us to 
abstract the values in the code, which lends itself to putting those values in a 
configuration file, putting the code into a function, or both: 


# Interpolated line if the variables have values:
ssh -i ~/.ssh/foo.id user@host.example.com [...]


# No values:
ssh host.example.com [...]
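

The reason empty values vanish is that the variables are expanded unquoted:
after word splitting, an empty expansion contributes no argument at all. The
following sketch makes that visible by counting arguments instead of running
ssh. The helper count_args and the key path are placeholders of ours:

```shell
# Hypothetical helper: report how many arguments it received
count_args() {
    echo "$#"
}

SSH_ID=''
count_args $SSH_ID host.example.com       # 1: the empty variable disappears

SSH_ID='-i /tmp/foo.id'
count_args $SSH_ID host.example.com       # 3: -i, /tmp/foo.id, and the host

count_args "$SSH_ID" host.example.com     # 2: quoting keeps "-i /tmp/foo.id"
                                          # as one word, not what ssh expects
```

This is one of the few places where leaving an expansion unquoted is the
intent, so it is worth a comment in real code.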


Next, we set up the command that SSH runs so that there is always output (0
or 1), then use || to check that the SSH command itself succeeded. That’s one
way to make sure that the SSH command runs (see also Recipe 4.4). If it fails,
leaving $result empty, we group commands using a {} code block to issue an
error message and exit. But since we’re always getting output from a
successful SSH command, we have to test the value; we can’t just use
if [ $result ]; then.


If we didn’t use the code block, we’d only issue the warning if the SSH 
command returned an empty $result, but we’d always exit. Read the code 
again until you understand why, because this is an easy way to get bitten. 
Likewise, if we’d tried to use a () subshell instead of the {} code block, our 
intent would fail because the exit 1 would exit the subshell, not the script. 
The script would then continue even after the SSH command had failed—but 
the code would look almost correct, so this might be tricky to debug. 
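

The difference is easy to see in isolation. In this sketch, false stands in
for the failed SSH command, and each variant runs in its own child sh so that
the exit can’t end your interactive session. The function names are ours:

```shell
# ( ) variant: exit 1 only leaves the subshell, so the script carries on
with_subshell() {
    sh -c 'false || ( echo "failed" >&2; exit 1 ); echo "still running"'
}

# { } variant: exit 1 ends the script itself
with_block() {
    sh -c 'false || { echo "failed" >&2; exit 1; }; echo "still running"'
}

with_subshell                                   # "failed", then "still running"
with_block || echo "script exited with status $?"   # "failed", then status 1
```

Only the second variant stops at the failure, which is the behavior the
example script relies on.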


We could have written the last test case as follows: 


[ $result = 1 ] && echo "$REMOTE_FILE present on $REMOTE_HOST" \
                || echo "$REMOTE_FILE not present on $REMOTE_HOST"


Which form to use depends on your style and the number of statements to 
execute in each situation. In this case it doesn’t matter. 


Finally, we’ve also been careful about formatting so that no lines are too 
long, but the code is still readable and our intent is clear. 


See Also


• Recipe 2.14, “Saving or Grouping Output from Several Commands”
• Recipe 4.4, “Telling Whether a Command Succeeded or Not”
• Recipe 14.21, “Using SSH Without a Password”
• Recipe 17.20, “Grepping ps Output Without Also Getting the grep Process
  Itself”
• Recipe 17.21, “Finding Out Whether a Process Is Running”


15.12 Redirecting Output for the Life of a Script


Problem 


You’d like to redirect output for an entire script, and you’d rather not have to 
edit every echo or printf statement. 


Solution


Use a little-known feature of the exec command to redirect STDOUT or
STDERR:


# Optional, save the "old" STDERR 
exec 3>&2 


# Redirect any output to STDERR to an error logfile instead 
exec 2> /path/to/error_log 


# Script with "globally" redirected STDERR goes here 


# Turn off redirect by reverting STDERR and closing FH3 
exec 2>&3- 


Discussion 


Usually exec replaces the running shell with the command supplied in its 
arguments, destroying the original shell. However, if no command is given, it 
can manipulate redirection in the current shell. You are not limited to 
redirecting STDOUT or STDERR, but they are the most common targets for 
redirection in this case. 
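

Here is a self-contained sketch of the same save, redirect, and restore
sequence, wrapped in a function so it can be tried safely. The name
demo_redirect and its logfile argument are ours. It spells the restore step as
2>&3 3>&- rather than 2>&3-, because as far as we know the combined form is an
extension found in bash and some other shells rather than something POSIX sh
guarantees:

```shell
# Hypothetical demo: send STDERR to the logfile named in $1 for a while,
# then put it back the way it was.
demo_redirect() {
    exec 3>&2             # save the "old" STDERR on file descriptor 3
    exec 2>> "$1"         # from here on, STDERR goes to the logfile
    echo "captured" >&2   # lands in the logfile
    exec 2>&3 3>&-        # restore STDERR, then close file descriptor 3
    echo "restored" >&2   # back on the original STDERR
}

log=$(mktemp)
demo_redirect "$log"
cat "$log"
rm -f "$log"
```

Running it shows “restored” on the terminal’s STDERR, while “captured” only
appears when the logfile is displayed.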


See Also 


• help exec
• Recipe 15.9, “Using bash Net-Redirection”


15.13 Working Around “Argument list too long” Errors


Problem 


You get an “Argument list too long” error while trying to do an operation 
involving shell wildcard expansion. 


Solution 


Use the xargs command, possibly in conjunction with find, to break up your 
argument list. 


For simple cases, just use a for loop or find instead of ls:


$ ls /path/with/many/many/files/*e*
-/bin/bash: /bin/ls: Argument list too long


# Short demo, surrounding ~ are for illustration only
$ for i in ./some_files/*e*; do echo "~$i~"; done
~./some_files/A file with (parens)~
~./some_files/A file with [brackets]~
~./some_files/File with embedded
newline~
~./some_files/file with = sign~
~./some_files/file with spaces~
~./some_files/file with |~
~./some_files/file with:~
~./some_files/file with;~
~./some_files/regular_file~


$ find ./some_files -name '*e*' -exec echo ~{}~ \;
~./some_files~
~./some_files/A file with [brackets]~
~./some_files/A file with (parens)~
~./some_files/regular_file~
~./some_files/file with spaces~
~./some_files/file with = sign~
~./some_files/File with embedded
newline~
~./some_files/file with;~
~./some_files/file with:~
~./some_files/file with |~


$ for i in /path/with/many/many/files/*e*; do echo "$i"; done
[This works, but the output is too long to list]


$ find /path/with/many/many/files/ -name '*e*'
[This works, but the output is too long to list]


This example works correctly with the echo command, but when you feed
that "$i" into other programs, especially other shell constructs, $IFS and
other parsing may come into play. The GNU find and xargs take that into
account with find -print0 and xargs -0. (No, we don’t know why it’s
-print0 and -0 instead of being consistent.) These arguments cause find to
use the null character (which can’t appear in a filename) instead of
whitespace as an output record separator, and xargs to use null as its input
record separator. That will correctly parse files containing odd characters:


find /path/with/many/many/files/ -name '*e*' -print0 | xargs -0 proggy
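

Where -print0 and -0 aren’t available, find’s own -exec ... {} + form is
another option. It is specified by POSIX and hands filenames to the command in
batches, much as xargs does, without passing them through a text pipeline at
all. A minimal sketch, using a throwaway directory and a hypothetical
count_files helper standing in for proggy:

```shell
# Hypothetical helper: for each batch of matching files that find hands
# over, print how many filenames the batch contained.
count_files() {
    find "$1" -type f -name '*e*' -exec sh -c 'echo "$#"' sh {} +
}

tmpdir=$(mktemp -d)
touch "$tmpdir/file with spaces" "$tmpdir/regular_file" "$tmpdir/nomatch"
count_files "$tmpdir"      # two names match, delivered as a single batch
rm -rf "$tmpdir"
```

Each filename arrives as exactly one argument, spaces and all, so nothing
needs to be re-parsed on the receiving side.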


Discussion 


Note that the default behavior of bash (and sh) is to return unmatched 
patterns unchanged. That means you could end up with your for loop setting 
$i to ./some_files/*e* if no files match the wildcard pattern. You can set 
the shopt -s nullglob option to cause filename patterns that match no files 
to expand to a null string, rather than expanding to themselves. 
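A minimal sketch of the difference, assuming an empty scratch directory (empty_dir and the counter names are hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: how an unmatched pattern behaves with and without nullglob.
empty_dir=$(mktemp -d)      # scratch directory with no files in it

# Default behavior: the loop runs once, with $i set to the literal pattern
shopt -u nullglob
default_count=0
for i in "$empty_dir"/*e*; do
    default_count=$(( default_count + 1 ))
done

# With nullglob: the pattern expands to nothing, so the loop body never runs
shopt -s nullglob
nullglob_count=0
for i in "$empty_dir"/*e*; do
    nullglob_count=$(( nullglob_count + 1 ))
done
shopt -u nullglob           # restore the default for the rest of the script

echo "default=$default_count nullglob=$nullglob_count"
rmdir "$empty_dir"
```

Turning nullglob back off afterward is a deliberate choice here: with it set, a command such as ls *.nomatch would run as plain ls, which is rarely what you want.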


You might assume that the for loop solution in the simple case would run
into the same problem as the ls command, but it doesn’t. Chet Ramey tells us:


ARG_MAX bounds the total space requirement of the exec* family of system
calls, so the kernel knows the largest buffer it will have to allocate. This is
all three arguments to execve: program name, argument vector, and
environment.


The [ls command] fails because the total bytes taken up by the arguments 
to execve exceeds ARG_MAX. The [for loop] succeeds because everything is 
done internally: though the entire list is generated and stored, execve is 
never called. 


Be careful that find doesn’t find too many files, since it will recursively
descend into all subdirectories by default while ls will not. Some versions of
find have a -maxdepth option to control how deep it goes. Using the for loop
may be easier.


Use the getconf ARG_MAX command to see what the limit is on your system.
It varies wildly (see also getconf LINE_MAX). Table 15-1 lists some
examples.
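As a rough sketch, you can compare the reported limit with the space your environment already uses, since the environment counts against the same budget. The variable names are hypothetical, and the environment size is only an approximation.

```shell
#!/usr/bin/env bash
# Sketch: report the exec limit and roughly how much the environment uses.
arg_max=$(getconf ARG_MAX)
env_bytes=$(( $(env | wc -c) ))     # approximate size of the environment

echo "ARG_MAX is $arg_max bytes; the environment uses about $env_bytes"
```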


TIP 


Per the GNU Core Utilities FAQ, Linux 2.6.23+ removes this limit, though it 
may still be reported, or it may not yet be removed in your particular 
distribution’s kernel. 


Table 15-1. System limits 


System                                ARG_MAX limits (bytes)
HP-UX 11                              2,048,000
Solaris (8, 9, 10)                    1,048,320
NetBSD 2.0.2, OpenBSD 3.7, macOS      262,144
Linux (Red Hat, Debian, Ubuntu)       131,072
FreeBSD 5.4                           65,536


See Also 


■ Question 19 in the GNU Core Utilities FAQ
■ Recipe 9.2, “Handling Filenames Containing Odd Characters”


15.14 Logging to syslog from Your Script 


Problem 
You’d like your script to be able to log to syslog. 


Solution 
Use logger, Netcat, or bash’s built-in network redirection features. 


logger is installed by default on most systems and is an easy way to send 
messages to the local syslog service: 


logger -p local0.notice -t "${0##*/}[$$]" test message


However, it does not send syslog to remote hosts by itself. If you need to do
that, you can use bash:


echo "<133>${0##*/}[$$]: Test syslog message from bash" \ 
> /dev/udp/loghost.example.com/514 


or Netcat: 


echo "<133>${0##*/}[$$]: Test syslog message from Netcat" | \
    nc -w1 -u loghost 514


Netcat is known as the “TCP/IP Swiss Army knife” and is usually not
installed by default. It may also be prohibited as a hacking tool by some
security policies, though bash’s net-redirection features do pretty much the
same thing. See the discussion in Recipe 15.9 for details on the
<133>${0##*/}[$$] part.
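For a quick reminder of where the 133 comes from: the syslog priority value is the facility number times 8 plus the severity number, and local0 is facility 16 while notice is severity 5. A minimal sketch follows; the syslog_pri helper is a hypothetical name, not an existing tool.

```shell
#!/usr/bin/env bash
# Sketch: build the <PRI> prefix used in the raw syslog messages above.
# syslog_pri is a hypothetical helper written for this illustration.
function syslog_pri {
    local facility=$1 severity=$2
    echo $(( facility * 8 + severity ))
}

pri=$(syslog_pri 16 5)              # local0 = 16, notice = 5
message="<${pri}>${0##*/}[$$]: Test syslog message"
echo "$message"
```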


Discussion


logger and Netcat have many more features than we include here. See the 
respective manpages for details. 


See Also 


■ man logger
■ man nc
■ Recipe 15.9, “Using bash Net-Redirection”


15.15 Using logger Correctly


Problem 


You want to use the logger tool so your script can send syslog messages, but
the defaults do not provide enough useful information.


Solution 


Use logger as follows: 


logger -t "${0##*/}[$$]" 'Your message here' 


Discussion 


In our opinion, failing to use the -t option to logger should at least trigger a
warning, if not a fatal error. The t is for “tag,” and as the manpage says it 
will “[mark] every line to be logged with the specified tag.” In other words, 
without -t you will have a hard time telling where your message came from! 


The tag of ${0##*/}[$$] may look like gibberish, but it’s actually what you
usually see when you look at syslog lines. It is just the basename of your
script and the process ID ($$) in square brackets. Compare the command with
and without the -t option:


$ logger -t "${0##*/}[$$]" 'Your message here'
$ tail -1 /var/log/syslog
Oct 26 12:16:01 hostname yourscript[977]: Your message here
$ logger 'Your message here'
$ tail -1 /var/log/syslog
Oct 26 12:16:01 hostname Your message here
$
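To see how the tag is assembled without touching syslog, here is a small sketch that applies the same ${...##*/} expansion to a sample path. The script_path value is made up for the demonstration; in a real script you would use $0.

```shell
#!/usr/bin/env bash
# Sketch: how the logger tag is built. script_path stands in for $0.
script_path='/usr/local/bin/yourscript'

base=${script_path##*/}     # strip the longest leading match of */
tag="${base}[$$]"           # append the process ID in square brackets

echo "$tag"
```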


logger has other interesting options and it’s well worth reading the manpage,
but be aware that some options may vary by age, version, and distribution, so
you need to consider that if your script will run in the wild. For example,
CentOS 5 and 6 versions of logger do not have the very useful -n option that
the Debian/Ubuntu version has:


-n, --server server
       Write to the specified remote syslog server using UDP instead of
       to the builtin syslog routines.


See Also 


■ Recipe 5.20, “Using bash for basename”
■ Recipe 11.10, “Logging with Dates”
■ Recipe 15.14, “Logging to syslog from Your Script”
■ man logger


15.16 Sending Email from Your Script 


Problem 


You’d like your script to be able to send email, optionally with attachments. 


Solution


These solutions depend on your system having a compatible mailer (such as
mail, mailx, or mailto), a message transfer agent (MTA) being installed and
running, and proper configuration of your email environment. Unfortunately,
you can’t always count on all of that, so these solutions must be well tested in
your intended environment.


The first way to send mail from your script is to write some code to generate 
and send a message, as follows: 


# Simple 
cat email_body | \ 
    mail -s "Message subject" recipient1@example.com recipient2@example.com


or: 


# Attachment only 
uuencode /path/to/attachment_file attachment_name | \ 
    mail -s "Message Subject" recipient1@example.com recipient2@example.com


or: 


# Attachment and body 
(cat email_body ; uuencode /path/to/attachment_file attachment_name) | \
    mail -s "Message Subject" recipient1@example.com recipient2@example.com


In practice, it’s not always that easy. For one thing, while uuencode will 
probably be there, mail and friends may or may not, and their capabilities 
may vary. In some cases mail and mailx are even the same program, hard- or 
soft-linked together. In production, you will want to use some abstraction to 
allow for portability. For example, mail works on Linux and the BSDs, but 
mailx is required for Solaris since its mail lacks support for -s. mailx works 
on some Linux distributions (e.g., Debian), but not others. We choose the 
mailer based on hostname in Example 15-4, but depending on your 
environment using uname -o might make more sense. 


Example 15-4. ch15/email_sample


# cookbook filename: email_sample
# Define some mail settings. Use a case statement with uname or hostname
# to tweak settings as required for your environment.
case $HOSTNAME in
    *.company.com ) MAILER='mail'   ;;  # Linux and BSD
    host1.*       ) MAILER='mailx'  ;;  # Solaris, BSD, and some Linuxes
    host2.*       ) MAILER='mailto' ;;  # Handy, if installed
esac
RECIPIENTS='recipient1@example.com recipient2@example.com'
SUBJECT="Data from $0"

[...]
# Create the body as a file or variable using echo, printf, or a here-document
# Create or modify SUBJECT and/or RECIPIENTS as needed
[...]

( echo "$email_body" ; uuencode "$attachment" "$(basename "$attachment")" ) \
    | $MAILER -s "$SUBJECT" "$RECIPIENTS"


We should also note that sending attachments in this way depends somewhat 
on the client you use to read the resulting message. Modern clients like 
Thunderbird (and Outlook) will detect a uuencoded message and present it as 
an attachment. Other clients may not. You can always save the message and 
uudecode it (uudecode is smart enough to skip the message part and just 
handle the attachment part), but that’s a pain. 


The second way to send mail from your scripts is to outsource the task to 
cron. While the exact feature set of cron varies from system to system, one 
thing in common is that any output from a cron job is mailed to the job’s 
owner or the user defined using the MAILTO variable. You can take advantage 
of that fact to get emailing for free, assuming that your email infrastructure 
works. 


The proper way to design a script intended to run from cron (and, many 
would argue, any script or Unix tool at all) is to make it silent unless it 
encounters a warning or error. If necessary, use a -v argument to optionally 
allow a more verbose mode, but don’t run it that way from cron, at least after
you’ve finished testing. The reason for this is as noted: cron emails you all 
the output. If you get an email message from cron every time your script 
runs, you'll soon start ignoring them. But if your script is silent except when 
there’s a problem, you'll only get a notification when there is a problem, 
which is ideal. 
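A minimal sketch of that "silent unless there is a problem" pattern follows. The VERBOSE flag and the verbose and warn helpers are hypothetical names chosen for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: stay quiet by default so cron only sends mail when something
# goes wrong. VERBOSE, verbose, and warn are made-up names.
VERBOSE=0
[ "$1" = '-v' ] && VERBOSE=1

# Print progress messages only when -v was given
function verbose {
    (( VERBOSE )) && echo "$@"
    return 0
}

# Warnings always go to STDERR, so cron will mail them to you
function warn {
    echo "WARNING: $*" >&2
}

verbose "Starting nightly job"      # silent unless run with -v
# ... real work goes here; call warn only when something needs attention
```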


Discussion 


Note that mailto is intended to be a multimedia- and MIME-aware update to
mail, and thus you could avoid using uuencode for sending attachments, but
it’s not as widely available as mail or mailx. If all else fails, elm or mutt may
be used in place of mail, mailx, or mailto, though they are even less likely to
be installed by default than mail*. Also, some versions of these programs
support a -r option to supply a return address in case you want to supply one.
mutt also has a -a option that makes sending attachments a breeze:


# Newer mutt versions require -- between the attachments and the recipients
cat "$message_body" | mutt -s "$subject" -a "$attachment_file" -- "$recipients"


mpack is another tool worth looking into, but it is very unlikely to be installed 
by default. Check your system’s software repository or download the source. 
From the manpage: 


The mpack program encodes the named file in one or more MIME 
messages. The resulting messages are mailed to one or more recipients, 
written to a named file or set of files, or posted to a set of newsgroups. 


Another way to handle the various names and locations of mail clients is 
shown in Chapter 8 of Classic Shell Scripting by Nelson H. F. Beebe and 
Arnold Robbins (O’Reilly), reprinted here as Example 15-5. 


Example 15-5. ch15/email_sample_css 


# cookbook filename: email_sample_css 
# From Chapter 8 of Classic Shell Scripting 


for MAIL in /bin/mailx /usr/bin/mailx /usr/sbin/mailx /usr/ucb/mailx /bin/mail \
            /usr/bin/mail; do
    [ -x $MAIL ] && break
done
[ -x $MAIL ] || { echo 'Cannot find a mailer!' >&2; exit 1; }
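A variation on the same idea, offered only as a sketch, lets the shell search $PATH with command -v instead of listing fixed locations. The candidate list and the MAILER variable are assumptions, and unlike Example 15-5 this version leaves it to the caller to decide what to do when nothing is found.

```shell
#!/usr/bin/env bash
# Sketch: pick the first mailer found on $PATH. The candidate names are
# an assumption; adjust the list for your environment.
MAILER=''
for candidate in mailx mail mutt; do
    if command -v "$candidate" >/dev/null 2>&1; then
        MAILER=$candidate
        break
    fi
done

if [ -n "$MAILER" ]; then
    echo "Using mailer: $MAILER"
else
    echo 'Cannot find a mailer!' >&2
fi
```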


uuencode is an old method for translating binary data into ASCII text for 
transmission over links that could not support binary, which is to say most of 
the internet before it became the internet and the web. We have it on good 
authority that at least some such links still remain, but even if you never 
encounter one it’s still useful to be able to convert an attachment into an 
otherwise ASCII medium in such a way that modern mail clients will 
recognize it. See also uudecode and mimeencode. Note that uuencoded files 
are about one-third larger than their binary equivalent, so you probably want 
to compress the file before uuencoding it. 


The problem with email, aside from the differing frontend mail user agent 
(MUA) programs like mail and mailx, is that there are a lot of moving parts 
that must all work together. This is exacerbated by the spam problem: mail 
administrators have had to so severely lock down mail servers that it can 
easily affect your scripts. All we can say here is to fully test your solution, 
and talk to your system and mail administrators if necessary. 


One other problem you might see is that some workstation-oriented Linux 
distributions, such as Ubuntu, don’t install or run an MTA by default since 
they assume you will be using a full-featured GUI client such as Evolution or 
Thunderbird. If that’s the case, command-line MUAs and email from cron 
won’t work either. Consult your distribution’s support groups for help with 
this as needed. 


JUST ENOUGH MTA FOR CRON 


We can make a good argument, for security attack surface, spam and general 
maintainability reasons, that the only servers that should be running full MTAs are 
dedicated mail servers. So how do you send mail from all the other nodes that are not 
mail servers? You install a package like nullmailer for Debian and derivatives or 
SSMTP for Red Hat and derivatives. 


While the configuration and implementation of those packages differ, the idea is the 
same: “just enough MTA for cron.” We encourage everyone to install one of these or a
similar package, because it’s amazing how often you can catch mistakes and 
misconfigurations through cron messages. Even if you think you have full monitoring, 
the ability for your nodes to send email is very useful. 


A trivial nullmailer configuration looks like this: 


/etc/nullmailer/adminaddr
    it@example.com
/etc/nullmailer/defaultdomain
    example.com
Optional: /etc/nullmailer/pausetime
    3600
/etc/nullmailer/remotes
    mail.example.com smtp --port=587


A trivial SSMTP /etc/ssmtp/ssmtp.conf configuration looks like this:


root=it@example.com
mailhub=mail.example.com:587


WARNING 


Despite what we just said in the previous tip, you do not want to allow all your 
nodes to send email all over the place! That’s just asking for trouble. We’re 
assuming that you have proper firewall rules, including egress rules that only 
allow email out to the world from dedicated email servers. You also need to log 
those rules and monitor those logs—a node that suddenly starts sending a lot of 
email anywhere definitely needs to be carefully looked at, because it has some 
kind of problem or infection. And that’s not always the kind of thing your 
regular monitoring for CPU use, disk space, etc. is likely to catch. 


See Also


■ man mail
■ man mailx
■ man mailto
■ man mutt
■ man uuencode
■ man cron
■ man 5 crontab


15.17 Automating a Process Using Phases 


Problem 


You have a long job or process you need to automate, but it may require 
manual intervention and you need to be able to restart at various points in the 
progress. You might use a GOTO to jump around, but bash doesn’t have that. 


Solution 


Use a case statement to break your script up into sections or phases. 


First, we'll define a standardized way to get answers from the user using 
Example 15-6, from Recipe 3.6. 


Example 15-6. ch03/func_choice.1


# cookbook filename: func_choice.1

# Let the user make a choice about something and return a standardized
# answer. How the default is handled and what happens next is up to
# the if/then after the choice in main.
# Called like: choice <prompt>
# e.g. choice "Do you want to play a game?"
# Returns: global variable CHOICE
function choice {

    CHOICE=''
    local prompt="$*"
    local answer

    read -p "$prompt" answer
    case "$answer" in
        [yY1] ) CHOICE='y';;
        [nN0] ) CHOICE='n';;
        *     ) CHOICE="$answer";;
    esac
} # end of function choice


Then, we’ll set up our phases as shown in Example 15-7. 


Example 15-7. ch15/using_phases 


# cookbook filename: using_phases

# Main loop
until [ "$phase" = "Finished." ]; do

    case $phase in

        phase0 )
            ThisPhase=0
            NextPhase="$(( $ThisPhase + 1 ))"
            echo '############################################'
            echo "Phase$ThisPhase = Initialization of FooBarBaz build"
            # Things that should only be initialized at the beginning of a
            # new build cycle go here
            # ...
            echo "Phase${ThisPhase}=Ending"
            phase="phase$NextPhase"
        ;;

        # ...

        phase20 )
            ThisPhase=20
            NextPhase="$(( $ThisPhase + 1 ))"
            echo '############################################'
            echo "Phase$ThisPhase = Main processing for FooBarBaz build"
            # ...

            choice "[P$ThisPhase] Do we need to stop and fix anything? [y/N]: "
            # The choice function returns its answer in the global CHOICE
            if [ "$CHOICE" = "y" ]; then
                echo "Re-run '$MYNAME phase${ThisPhase}' after handling this."
                exit $ThisPhase
            fi

            echo "Phase${ThisPhase}=Ending"
            phase="phase$NextPhase"
        ;;

        # ...

        * )
            echo "What the heck?!? We should never get HERE! Gonna croak!"
            echo "Try $0 -h"
            exit 99
            phase="Finished."
        ;;
    esac
    printf "%b" "\a" # Ring the bell
done
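The listing above never shows how $phase gets its first value. One possible approach, sketched here under the assumption that the starting phase is passed as the first argument, is to default to phase0 when nothing is given. The run_phases wrapper and the two toy phases are hypothetical and stand in for the real build steps.

```shell
#!/usr/bin/env bash
# Sketch: start at the phase named in the first argument, else at phase0.
# run_phases and its two toy phases are made up for this illustration.
function run_phases {
    local phase="${1:-phase0}"
    local visited=''

    until [ "$phase" = "Finished." ]; do
        case $phase in
            phase0 ) visited+="0 "; phase='phase1'    ;;
            phase1 ) visited+="1 "; phase='Finished.' ;;
            *      ) echo "Unknown phase: $phase" >&2; return 99 ;;
        esac
    done
    echo "$visited"
}

run_phases              # runs phase0, then phase1
run_phases phase1       # restarts at phase1 only
```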
Discussion 


Since exit codes only go up to 255, the exit $ThisPhase line limits you to
that many phases. And our exit 99 line limits you even more, although that
one is easily adjusted. If you require more than 254 phases (plus 255 as the
error code), you have our sympathy. You can either come up with a different
exit code scheme, or chain several scripts together.
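A short sketch shows what happens past that limit: the shell keeps only the low 8 bits of the exit value, so larger numbers wrap around. The exit_status_of helper is a hypothetical name used only for this demonstration.

```shell
#!/usr/bin/env bash
# Sketch: exit values wrap modulo 256. exit_status_of is a made-up helper
# that runs exit in a subshell and reports the status the parent sees.
function exit_status_of {
    local status=0
    ( exit "$1" ) || status=$?
    echo "$status"
}

for code in 99 255 256 300; do
    echo "exit $code is seen as $(exit_status_of "$code")"
done
```

So a phase numbered 256 would exit with status 0 and look like success to the caller, which is a good reason to keep phase numbers well under the limit.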


You should probably set up a usage and/or summary routine that lists the 
various phases: 


Phase0 = Initialization of FooBarBaz build
Phase20 = Main processing for FooBarBaz build 
Phase28 ... 


You can probably grep most of the text out of the code with something like
grep 'Phase$ThisPhase' my_script.


You may also want to log to a local flat file, syslog, or some other
mechanism. In that case, define a function like logmsg and use it as
appropriate in the code. It could be as simple as:


function logmsg {
    # Write a timestamped log message to the screen and logfile
    # Note tee -a to append
    # printf "%b" "$(date '+%Y-%m-%d %H:%M:%S'): $*" | tee -a $LOGFILE
    printf "%(%Y-%m-%d %H:%M:%S)T: %b\n" -1 "$*" | tee -a $LOGFILE
} # end of function logmsg


This function uses the newer printf format that supports time and date values.
If you are using an older shell (before version 4.2, which introduced the
%(...)T format), switch the printf with the commented printf line in this
function.
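If the same script has to run on both old and new shells, one option is to choose the printf form at runtime. This is only a sketch: it assumes that testing BASH_VERSINFO for 4.2 or later is an acceptable check, and it uses a temporary LOGFILE created just for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: pick the timestamp method based on the running bash version.
# LOGFILE points at a throwaway temp file for this demo only.
LOGFILE=$(mktemp)

function logmsg {
    if (( BASH_VERSINFO[0] > 4 ||
          ( BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 2 ) )); then
        # Built-in time formatting, no external date process needed
        printf "%(%Y-%m-%d %H:%M:%S)T: %b\n" -1 "$*"
    else
        # Older shells fall back to calling date
        printf "%s: %b\n" "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
    fi | tee -a "$LOGFILE"
} # end of function logmsg

logmsg "Phase0 = Initialization of FooBarBaz build"
```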


You may note that this larger script violates our usual standard of being silent 
unless it encounters a problem. Since it is designed to be interactive, we’re 
OK with that. 


See Also 


■ Recipe 3.5, “Getting User Input”
■ Recipe 3.6, “Getting Yes or No Input”
■ Recipe 11.10, “Logging with Dates”
■ Recipe 15.14, “Logging to syslog from Your Script”


15.18 Doing Two Things at Once 


Problem 


A pipeline of commands goes only one way, each process writing to the next 
in line. Can two processes converse with each other, each reading as its input 
the output of the other command? 


Solution 
Yes! As of version 4 of bash, the coproc command can do just that. 


Example 15-8 is a simple example that uses the bc program, an arbitrary- 
precision calculator language, as a coprocess, allowing bash to send 
calculations to bc and read back the results. It’s one way of giving bash the 
ability to do floating-point calculations, though we’re only using it here as an 
example of the coproc command. 


WARNING 


Note that bash must be compiled with --enable-coprocesses for this to
work. That is the default, but some packages may not have it.


Example 15-8. ch15/fpmath 


# cookbook filename: fpmath
# using coproc for floating-point math

# initialize the coprocess
# call this first
# before attempting any calls to fpmath
function fpinit ()
{
    coproc /usr/bin/bc

    bcin=${COPROC[1]}
    bcout=${COPROC[0]}
    echo "scale=5" >& ${bcin}
}

# compute with floating-point numbers
# by sending the args to bc
# then reading its response
function fpmath()
{
    echo "$@" >& ${bcin}
    if read -t 0.25 -u ${bcout} responz
    then
        echo "$responz"
    fi
}

############################
# main

fpinit

while read aline
do
    answer=$(fpmath "$aline")
    if [[ -n $answer ]]
    then
        echo $answer
    fi
done


Discussion 


For our example we define two functions, fpinit and fpmath. The purpose
of fpinit is to set up the coprocess. The purpose of fpmath is to get a
floating-point calculation done by sending the request to the coprocess and
reading back the result. To demonstrate these functions we wrote a while
loop that reads a line of input from the user, then sends that input to the
coprocess and reads back the result.


coproc will execute a command (or pipeline of commands) alongside the
current shell process. In our case we gave it /usr/bin/bc (though a full path
is not required; the shell will search $PATH as with any command).
Furthermore, it creates two pipes, one connected to the standard output of the
command and one connected to its standard input. These connections are
stored in a shell array called COPROC by default. Index 0 of that array holds
the output file descriptor of that process; index 1 holds the input file
descriptor of that process.


That may seem backward to a systems programmer, but remember that the
output of the coprocess can act as the input to the calling process (the shell
script), and vice versa. To keep their usage clearer we assigned them to
variables that describe how we will use them. We chose $bcin to hold the
file descriptor to be used to send input to the bc command and $bcout to hold
the file descriptor to be used to read its output.


We use these file descriptors in our fpmath function. To send a calculation to
the bc process we echo the text of a calculation (for example, "3.4 * 7.52")
and redirect that output to the input file descriptor. In our example, that
means that we redirect to $bcin. To get the result back from bc we use the
read command, which has an option (-u) that lets us specify the file
descriptor from which to read. In our case we use $bcout.


We’ve also used the -t option on the read command. That option sets a
timeout value after which the read will return, possibly empty-handed. We
use that here since not every valid command to bc will result in output. (For
example, "x=5" will store the value 5 in the variable x but will generate no
output.) Versions of bash that are new enough to have the coproc command
are also new enough to support a fractional value for the timeout value. Older
versions only allowed integers.
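The array is called COPROC only by default; you can also give the coprocess a name. The sketch below is not part of the fpmath example. It uses a hypothetical coprocess named UPPER, written as a shell loop so that it does not depend on bc being installed, and it also shows one way to shut the coprocess down by closing its input.

```shell
#!/usr/bin/env bash
# Sketch: a named coprocess. UPPER and the upper_* variables are made-up
# names; the coprocess is a small loop that uppercases each line it reads.
coproc UPPER { while IFS= read -r line; do echo "${line^^}"; done; }

upper_in=${UPPER[1]}        # we write to this to reach the coprocess
upper_out=${UPPER[0]}       # we read its output from this
upper_pid=$UPPER_PID        # saved so that we can wait for it later

echo "hello coproc" >&"$upper_in"
read -r -t 2 -u "$upper_out" reply
echo "$reply"

exec {upper_in}>&-          # close our end; the coprocess sees end-of-file
wait "$upper_pid"
```

Copying the descriptors and the PID into ordinary variables right away mirrors what fpinit does with $bcin and $bcout, and it means the values are still at hand after the coprocess exits.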


See Also 


■ man bash
■ help coproc


15.19 Running an SSH Command on Multiple Hosts


Problem 


You need to run a command over SSH on multiple hosts.


Solution 


Wrap your SSH command in a for loop: 


$ for host in host1 host2 host3; do echo -n "On $host, I am: " ;
> ssh $host 'whoami' ; done
On host1, I am: root
On host2, I am: jp
On host3, I am: jp
$


Discussion 


This looks very easy, and it is when everything works, but there are a few 
points to keep in mind. 


First, all of the underlying networking and the firewall, DNS, and similar 
aspects have to already be working. 

Second, while not strictly necessary, it’s much more convenient to do this
when using SSH keys, so you’ll want to read Recipe 14.21.


Third, you can quickly run into quoting issues in the SSH command. For 
example, consider: 


$ for host in host{1..3};
> do echo "$host:" ;
> ssh $host 'grep "$HOSTNAME" /etc/hosts' ;
> done


That’s straightforward: we enclose the ssh command in single quotes so that
our local bash shell will not interpolate it, and we enclose our grep argument
in double quotes for clarity, though that is not strictly needed. But what if we
have some variables that our local bash needs to interpolate and others that
the remote bash must handle? Or what if we actually need to grep for single
quotes?


We can handle those problems by enclosing the ssh command in double 
quotes, then escaping any variables and/or double quotes needed on the 
remote side, but it gets ugly fast: 


$ for host in host{1..3};
> do ssh $host "echo \"Local '$host' is remote '\$HOSTNAME'\"";
> done
Local 'host1' is remote 'host1'
Local 'host2' is remote 'host2'
Local 'host3' is remote 'host3'
$
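You can rehearse this kind of quoting without any remote hosts. In the sketch below, bash -c stands in for the remote shell, and REMOTE_NAME is a hypothetical variable that exists only on the "remote" side. Neither is part of the recipe; they just make it visible which shell expands what.

```shell
#!/usr/bin/env bash
# Sketch: test local vs. remote expansion using bash -c as a stand-in for
# the shell that ssh would start. REMOTE_NAME is a made-up variable.
host='host1'

# $host is expanded here and now; \$REMOTE_NAME is left for the other shell
remote_cmd="echo \"Local '$host' is remote '\$REMOTE_NAME'\""
echo "Command as sent: $remote_cmd"

# Run the string in a child shell where only REMOTE_NAME is defined
result=$(REMOTE_NAME='far-away' bash -c "$remote_cmd")
echo "$result"
```

Printing the command string before sending it is a cheap way to catch quoting mistakes before they reach a real server.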


We would like to point out that you can do amazing things in the OpenSSH 
configuration file, and it’s well worth spending some time learning about, but 
unfortunately that is well beyond the scope of this book. 


We should also point out that while this can be a handy technique, you are 
better off learning and using a real configuration management system (CMS) 
for these kinds of tasks. We really like Ansible, but there are many options, 
including at least one written in bash itself: https://github.com/wffls/waffles. 


See Also 


■ Recipe 14.21, “Using SSH Without a Password”
■ Recipe 14.22, “Restricting SSH Commands”
■ man ssh
■ man ssh_config
■ https://github.com/wffls/waffles


Chapter 16. Configuring and Customizing bash


Would you want to work in an environment where you couldn’t adjust things 
to your liking? Imagine not being able to adjust the height of your chair, or 
being forced to walk the long way to the lunchroom, just because someone 
else thought that was the “right way.” That sort of inflexibility wouldn’t be 
acceptable for long; however, that’s what most users expect, and accept, from 
their computing environments. But if you’re used to thinking of your user 
interface as something inflexible and unchangeable, relax—the user interface 
is not carved in stone. bash lets you customize it so that it works with you, 
rather than against you. 


bash gives you a very powerful and flexible environment. Part of that 
flexibility is the extent to which it can be customized. If you’re a casual Unix 
user, or if you’re used to a less flexible environment, you might not be aware 
of what’s possible. This chapter shows you how to configure bash to suit 
your individual needs and style. If you think the Unix cat command has a 
ridiculous name (most non-Unix people would agree), you can define an alias 
that renames it. If you use a few commands all the time, you can assign 
abbreviations to them, too—or even misspellings that correspond to your 
favorite typing errors (e.g., “mroe” for the more command). You can create 
your own commands, which can be used the same way as standard Unix 
commands. You can alter the prompt so that it contains useful information 
(like the current directory). And you can alter the way bash behaves; for 
example, you can make it case-insensitive, so that it doesn’t care about the 
difference between upper- and lowercase. You will be surprised and pleased 
at how much you can improve your productivity with a few simple bash 
tweaks, especially to readline. 


For more information about customizing and configuring bash, see Chapter 3 
of Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly).


16.1 bash Startup Options


Problem 


You’d like to understand the various options you can use when starting bash,
but bash --help is not helping you.


Solution 


In addition to bash --help, try bash -c "help set" and bash -c help, 
or just help set and help if you are already running in a bash shell. 


Discussion 


bash sometimes has several different ways to set the same option, and this is 
an example of that. You can set an option on startup (for example, bash -x), 
then later turn the same option off interactively using set +x. 
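As a small sketch of that on/off pairing, the special parameter $- lists the single-letter options currently in effect, so you can check whether x is set. The after_on and after_off variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: set -x inside a script has the same effect as starting it with
# bash -x; $- shows which single-letter options are currently on.
set -x
case $- in *x*) after_on='on' ;; *) after_on='off' ;; esac

set +x
case $- in *x*) after_off='on' ;; *) after_off='off' ;; esac

echo "after set -x: $after_on, after set +x: $after_off"
```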


See Also 


■ Appendix A
■ Recipe 19.12, “Testing bash Script Syntax”


16.2 Customizing Your Prompt 


Problem 


The default bash prompt is usually something uninformative that ends with $ 
and doesn’t tell you much. You would like to customize it to show 
information you find useful. 


Solution 


Customize the $PS1 and $PS2 variables as you desire. 


The default prompt varies depending on your system. bash itself will show its
major and minor version (\s-\v\$); for example, bash-3.00$. However,
your operating system may have its own default, such as [user@host ~]$
([\u@\h \W]\$) for some versions of Fedora. Our solution presents eight
basic prompts and three fancier prompts.


Basic prompts 


Here are eight examples of more useful prompts that will work with bash
1.14.7 or newer. The trailing \$ displays # if the effective UID is zero (i.e.,
you are root) and $ otherwise:


1. Username@hostname, the date and time, and the current working 
directory: 


$ export PS1='[\u@\h \d \A] \w \$ ' 
[jp@freebsd Wed Dec 28 19:32] ~ $ cd /usr/local/bin/ 
[jp@freebsd Wed Dec 28 19:32] /usr/local/bin $ 


2. Username@long-hostname, the date and time in ISO 8601 format, and
the basename of the current working directory (\W):


$ export PS1='[\u@\H \D{%Y-%m-%d %H:%M:%S%z}] \W \$ '
[jp@freebsd.jpsdomain.org 2005-12-28 19:33:03-0500] ~ $ cd /usr/local/
[jp@freebsd.jpsdomain.org 2005-12-28 19:33:06-0500] local $


3. Username@hostname, the bash version, and the current working 
directory (\w): 


$ export PS1='[\u@\h \V \w] \$ ' 
[jp@freebsd 3.00.16] ~ $ cd /usr/local/bin/ 
[jp@freebsd 3.00.16] /usr/local/bin $ 


4. Newline, username@hostname, base PTY, shell level, history number, 
newline, and full working directory name ($PWD): 


$ export PS1='\n[\u@\h \l:$SHLVL:\!]\n$PWD\$ '


[jp@freebsd ttyp0:3:21]
/home/jp$ cd /usr/local/bin/


[jp@freebsd ttyp0:3:22]
/usr/local/bin$


PTY is the number of the pseudoterminal (in Linux terms) to which you 
are connected. This is useful when you have more than one session and 
are trying to keep track of which is which. Shell level is the depth of 
subshells you are in. When you first log in it’s 1, and as you run 
subprocesses (for example, screen) it increments, so after running 
screen it would normally be 2. The history line is the number of the 
current command in the command history. 


5. Username@hostname, the exit status of the last command, and the
current working directory. Note the exit status will be reset (and thus
useless) if you execute any commands from the prompt:


$ export PS1='[\u@\h $? \w \$ '
[jp@freebsd 0 ~ $ cd /usr/local/bin/
[jp@freebsd 0 /usr/local/bin $ true
[jp@freebsd 0 /usr/local/bin $ false
[jp@freebsd 1 /usr/local/bin $ true
[jp@freebsd 0 /usr/local/bin $


6. Newline, username@hostname, and the number of jobs the shell is
currently managing. This can be useful if you run a lot of background
jobs and forget that they are there:


$ export PS1='\n[\u@\h jobs:\j]\n$PWD\$ '


[jp@freebsd jobs:0]
/tmp$ ls -lar /etc > /dev/null &
[1] 96461


[jp@freebsd jobs:1]
/tmp$
[1]+ Exit 1 ls -lar /etc >/dev/null


[jp@freebsd jobs:0]
/tmp$


This example goes really crazy and shows everything. 


7. Newline, username@hostname, terminal, shell level, history number,
number of jobs, bash version, and full working directory:


$ export PS1='\n[\u@\h t:\l l:$SHLVL h:\! j:\j v:\V]\n$PWD\$ '


[jp@freebsd t:ttyp1 l:2 h:91 j:0 v:3.00.16]
/home/jp$


8. Newline, username@hostname, T for terminal, L for shell level, C for 
command number, and date and time in ISO 8601 format: 


$ PS1='\n[\u@\h:T\l:L$SHLVL:C\!:\D{%Y-%m-%d_%H:%M:%S_%Z}]\n$PWD\$ '


[jp@freebsd:Tttyp1:L1:C337:2006-08-13_03:47:11_EDT]
/home/jp$ cd /usr/local/bin/


[jp@freebsd:Tttyp1:L1:C338:2006-08-13_03:47:16_EDT]
/usr/local/bin$


This prompt is one you will either love or hate. It shows very clearly 
who did what, when, and where and is great for documenting steps you 
took for some task via a simple copy and paste from a scrollback buffer 
—but some people find it much too cluttered and distracting. 


Fancy prompts 


Here are three fancy prompts that use ANSI escape sequences for colors, or 
to set the contents of the title bar in an xterm—but be aware that these will 
not always work. There is a bewildering array of variables in system
settings, xterm emulation, and SSH and Telnet clients, all of which can affect 
these prompts. 


Also, note that such escape sequences should be surrounded by \[ and \],
which tells bash that the enclosed characters are nonprinting. Otherwise, bash
(technically, really readline) will be confused about line lengths and wrap
lines in the wrong place:


1. Username@hostname, and the current working directory in light blue 
(color not shown in print): 


$ export PS1='\[\033[1;34m\][\u@\h:\w]\$\[\033[0m\] '
[jp@freebsd:~]$
[jp@freebsd:~]$ cd /tmp
[jp@freebsd:/tmp]$


2. Username@hostname, and the current working directory in both the
xterm title bar and the prompt itself. If you are not running in an xterm
this may produce garbage in your prompt:


$ export PS1='\[\033]0;\u@\h:\w\007\][\u@\h:\w]\$ '
[jp@ubuntu:~]$
[jp@ubuntu:~]$ cd /tmp
[jp@ubuntu:/tmp]$


3. Both color and xterm updates:


$ export PS1='\[\033]0;\u@\h:\w\007\]\[\033[1;34m\][\u@\h:\w]\$\[\033[0m\] '
[jp@ubuntu:~]$
[jp@ubuntu:~]$ cd /tmp
[jp@ubuntu:/tmp]$
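The \[ and \] wrapping is easy to get wrong when a long prompt string is typed by hand. As a minimal sketch (the helper name prompt_color and the variable demo_ps1 are hypothetical and do not appear in the prompts file), a small function can generate the wrapped sequence so that the prompt is assembled from readable pieces:

```shell
# Hypothetical helper: emit an ANSI SGR sequence wrapped in \[ and \]
# so that readline does not count it toward the visible prompt width.
function prompt_color {
    printf '\\[\\033[%sm\\]' "$1"
}

# Assemble the light blue prompt from the first fancy example.
demo_ps1="$(prompt_color '1;34')"'[\u@\h:\w]\$'"$(prompt_color 0) "

# Inspect the string before assigning it to PS1.
printf '%s\n' "$demo_ps1"
```

If the printed string looks right, PS1=$demo_ps1 installs it. Keeping the escape codes in one function means a missing \] can only happen in one place.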


To save you some tedious typing if you want to try them out, all of 
these prompts are available in the file ./ch16/prompts in this book’s 
GitHub repository. The contents of that file are shown in Example 16-1. 


Example 16-1. ch16/prompts 


# cookbook filename: prompts

# Username @ short hostname, the date and time, and the current working
# directory (CWD):
export PS1='[\u@\h \d \A] \w \$ '

# Username @ long hostname, the date and time in ISO 8601 format, and the
# basename of the current working directory (\W):
export PS1='[\u@\H \D{%Y-%m-%d %H:%M:%S%z}] \W \$ '

# Username @ short hostname, bash version, and the current working
# directory (\w):
export PS1='[\u@\h \V \w] \$ '

# Newline, username @ short hostname, base PTY, shell level, history number,
# newline, and full working directory name (PWD):
export PS1='\n[\u@\h \l:$SHLVL:\!]\n$PWD\$ '

# Username @ short hostname, the exit status of the last command, and the
# current working directory:
export PS1='[\u@\h $? \w]\$ '

# Newline, username @ short hostname, and the number of jobs
# in the background:
export PS1='\n[\u@\h jobs:\j]\n$PWD\$ '

# Newline, username @ short hostname, terminal, shell level, history, jobs,
# version, and full working directory name:
export PS1='\n[\u@\h t:\l l:$SHLVL h:\! j:\j v:\V]\n$PWD\$ '

# Newline, username @ short hostname, T for terminal, L for shell level, C
# command number, and the date and time in ISO 8601 format:
export PS1='\n[\u@\h:T\l:L$SHLVL:C\!:\D{%Y-%m-%d_%H:%M:%S_%Z}]\n$PWD\$ '

# Username @ short hostname, and the current working directory in light
# blue:
export PS1='\[\033[1;34m\][\u@\h:\w]\$\[\033[0m\] '

# Username @ short hostname, and the current working directory in both the
# xterm title bar and the prompt itself:
export PS1='\[\033]0;\u@\h:\w\007\][\u@\h:\w]\$ '

# Both color and xterm updates:
export PS1='\[\033]0;\u@\h:\w\007\]\[\033[1;34m\][\u@\h:\w]\$\[\033[0m\] '


Discussion 


Note that the export command need only be used once to flag a variable to be 
exported to child processes. 


Assuming the promptvars shell option is set, which it is by default, prompt
strings are decoded and expanded (via parameter expansion, command
substitution, and arithmetic expansion), quotes are removed, and they are
finally displayed. Prompt strings are $PS0, $PS1, $PS2, $PS3, and $PS4.


$PS0 is only available in bash version 4.4 or newer. For more on this
“pre-execution” prompt, see the next recipe.


The command prompt is $PS1. 


The $PS2 prompt is the secondary prompt displayed when bash needs 
more information to complete a command. It defaults to > but you may 
use anything you like. 


$PS3 is the select prompt (see Recipe 3.7, “Selecting from a List of 
Options” and Recipe 6.16, “Creating Simple Menus”), which defaults to 
#?. 

$PS4 is the xtrace (debugging) prompt, with a default of +. Note that the
first character of $PS4 is replicated as many times as needed to denote
levels of indirection in the currently executing command:


$ export PS2='Secondary> '
$ for i in *
Secondary> do
Secondary> echo $i
Secondary> done
cheesy_app
data_file
hard_to_kill
mcd
mode

$ export PS3='Pick me: '
$ select item in 'one two three'; do echo $item; done
1) one two three
Pick me: ^C

$ export PS4='+ debugging> '
$ set -x
$ echo $( echo $( for i in *; do echo $i; done ) )
+++ debugging> for i in '*'
+++ debugging> echo cheesy_app
+++ debugging> for i in '*'
+++ debugging> echo data_file
+++ debugging> for i in '*'
+++ debugging> echo hard_to_kill
+++ debugging> for i in '*'
+++ debugging> echo mcd
+++ debugging> for i in '*'
+++ debugging> echo mode
++ debugging> echo cheesy_app data_file hard_to_kill mcd mode
+ debugging> echo cheesy_app data_file hard_to_kill mcd mode
cheesy_app data_file hard_to_kill mcd mode


Since the $PS1 prompt is only useful when you are running bash
interactively, the best place to set it is either globally in /etc/bashrc or locally
in ~/.bashrc.


As a style note, we recommend putting a space character as the last character 
in the $PS1 string. It makes it easier to read what is on your screen by 
separating the prompt string from the commands that you type. For this 
reason, and because your string may contain other spaces or special 
characters, it is a good idea to use double or even single quotes to quote the 
string when you assign it to $PS1.


There are at least three easy ways to display your current working directory
(CWD) in your prompt: \w, \W, and $PWD. \W will print the basename, or last
part of the directory path, while \w will print the entire path. Note that both
will print ~ instead of whatever $HOME is set to when you are in your home
directory. That drives some people crazy, so to print the entire CWD, use
$PWD. Printing the entire CWD will cause the prompt to change length, and it
can even wrap in deep directory structures. That can drive other people crazy.
If you have bash 4 or newer, just use $PROMPT_DIRTRIM with \w or \W (it
does not affect $PWD). The Bash Reference Manual describes this variable as
follows:


If set to a number greater than zero, the value is used as the number of 
trailing directory components to retain when expanding the \w and \W 
prompt string escapes.... Characters removed are replaced with an 
ellipsis. 
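To see the effect of this variable without touching your real prompt, here is a small sketch. It assumes bash 4.4 or newer, because it relies on the @P expansion operator; the function name show_trimmed_cwd and its default of two components are our own choices for illustration:

```shell
# Hypothetical helper: show how \w would render the current directory
# with PROMPT_DIRTRIM set to $1 (default 2). Needs bash 4.4+ for @P.
function show_trimmed_cwd {
    local PROMPT_DIRTRIM=${1:-2}
    local template='\w'
    # @P expands the value the same way bash expands a prompt string
    echo "${template@P}"
}
```

Run it from a deeply nested directory, for example with show_trimmed_cwd 2 and then show_trimmed_cwd 4, to compare how many trailing components are kept.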


If you can use $PROMPT_DIRTRIM you should, but if you can’t, Example 16-2
provides a function to truncate the working directory and a prompt to use the
function.


Example 16-2. ch16/func_trunc_PWD 


# cookbook filename: func_trunc_PWD

function trunc_PWD {
    # $PWD truncation code adapted from The Bash Prompt HOWTO:
    # 11.10. Controlling the Size and Appearance of $PWD
    # http://www.tldp.org/HOWTO/Bash-Prompt-HOWTO/x783.html

    # How many characters of the $PWD should be kept
    local pwdmaxlen=30
    # Indicator that there has been directory truncation:
    local trunc_symbol='...'
    # Temp variable for PWD
    local myPWD=$PWD

    # Replace any leading part of $PWD that matches $HOME with '~'
    # OPTIONAL, comment out if you want the full path!
    myPWD=${PWD/$HOME/~}

    if [ ${#myPWD} -gt $pwdmaxlen ]; then
        local pwdoffset=$(( ${#myPWD} - $pwdmaxlen ))
        echo "${trunc_symbol}${myPWD:$pwdoffset:$pwdmaxlen}"
    else
        echo "$myPWD"
    fi
}


Here’s a demonstration: 


$ source file/containing/trunc_PWD

[jp@freebsd ttyp0:3:60]
~/this is a bunch/of really/really/really/long directories/did I mention really/
really/long$ export PS1='\n[\u@\h \l:$SHLVL:\!]\n$(trunc_PWD)\$ '

[jp@freebsd ttyp0:3:61]
...d I mention really/really/long$


You will notice that the prompts here are single-quoted so that $ and other
special characters are taken literally. The prompt string is evaluated at display
time, so the variables are expanded as expected. Double quotes may also be
used, though in that case you must escape shell metacharacters, e.g., by using
\$ instead of $.


The command number and the history number are usually different: the 
history number of a command is its position in the history list, which may 
include commands restored from the history file, while the command number 
is the position in the sequence of commands executed during the current shell 
session. 


There is also a special variable called $PROMPT_COMMAND, which if set is
interpreted as a command to execute before the evaluation and display of
$PS1. The issue with that, and with using command substitution from within
the $PS1 prompt, is that these commands are executed every time the prompt
is displayed, which is often. For example, you could embed a command
substitution such as $(ls -1 | wc -l) in your prompt to give you a count of
files in the current working directory. But on an old or heavily utilized
system in a large directory, that may result in significant delays before the
prompt is presented and you can get on with your work. Prompts are best left
short and simple (notwithstanding some of the monsters shown in the
Solution section). Define functions or aliases to easily run on demand instead
of cluttering up and slowing down your prompt.
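As one way to follow that advice, the file count can live in a function that you call only when you want it. This is a sketch under our own assumptions: the name nfiles is hypothetical, and it counts non-hidden entries with a shell glob rather than calling ls and wc:

```shell
# Hypothetical on-demand replacement for $(ls -1 | wc -l) in a prompt.
# Counts non-hidden entries in $1 (default: the current directory).
function nfiles {
    local dir=${1:-.} entry count=0
    for entry in "$dir"/*; do
        # An unmatched glob stays literal, so confirm the entry exists
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            count=$(( count + 1 ))
        fi
    done
    echo "$count"
}
```

Typing nfiles or nfiles /etc then costs time only when you ask for it, and the prompt itself stays fast.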


To work around ANSI or xterm escapes that produce garbage in your prompt 
if they are not supported, you can use something like this in your rc file: 


case $TERM in
    xterm*) export \
        PS1='\[\033]0;\u@\h:\w\007\]\[\033[1;34m\][\u@\h:\w]\$\[\033[0m\] ' ;;
    *)      export PS1='[\u@\h:\w]\$ ' ;;
esac


See the section “Prompt String Customizations” in Appendix A for more on 
this topic. 


Colors 


In the ANSI example we just discussed, 1;34m means “set the character 
attribute to light, and the character color to blue.” 0m means “clear all 
attributes and set no color.” See “ANSI Color Escape Sequences” in 
Appendix A for the codes. The trailing m indicates a color escape sequence. 


Example 16-3 is a script that displays all the possible combinations. If this 
does not display colors on your terminal, then ANSI color is not enabled or 
supported. 

Example 16-3. ch16/colors 

#!/usr/bin/env bash
# cookbook filename: colors
#
# Daniel Crisman's ANSI color chart script from
# The Bash Prompt HOWTO: 6.1. Colours
# http://www.tldp.org/HOWTO/Bash-Prompt-HOWTO/x329.html.
#
# This file echoes a bunch of color codes to the
# terminal to demonstrate what's available. Each
# line is the color code of one foreground color,
# out of 17 (default + 16 escapes), followed by a
# test use of that color on all nine background
# colors (default + 8 escapes).
#

T='gYw' # The test text

echo -e "\n                 40m     41m     42m     43m\
     44m     45m     46m     47m";

for FGs in '    m' '   1m' '  30m' '1;30m' '  31m' '1;31m' '  32m' \
           '1;32m' '  33m' '1;33m' '  34m' '1;34m' '  35m' '1;35m' \
           '  36m' '1;36m' '  37m' '1;37m'; do
    FG=${FGs// /}
    echo -en " $FGs \033[$FG  $T  "
    for BG in 40m 41m 42m 43m 44m 45m 46m 47m; do
        echo -en "$EINS \033[$FG\033[$BG  $T  \033[0m";
    done
    echo;
done
echo


NOTE

If you’d like a simple way to try out some colorful themes for your terminal,
check out Bashish.


See Also 


- The Bash Reference Manual
- ./examples/scripts.noah/prompt.bash in the bash source tarball
- http://www.tldp.org/HOWTO/Bash-Prompt-HOWTO/index.html
- http://sourceforge.net/projects/bashish
- Recipe 1.3, “Decoding the Prompt”
- Recipe 3.7, “Selecting from a List of Options”
- Recipe 6.16, “Creating Simple Menus”
- Recipe 6.17, “Changing the Prompt on Simple Menus”
- Recipe 16.3, “A Prompt Before Your Program Runs”
- Recipe 16.12, “Using Secondary Prompts: $PS2, $PS3, $PS4”
- Recipe 16.20, “Using Initialization Files Correctly”
- Recipe 16.21, “Creating Self-Contained, Portable rc Files”
- Recipe 16.22, “Getting Started with a Custom Configuration”
- “Prompt String Customizations” in Appendix A
- “ANSI Color Escape Sequences” in Appendix A


16.3 A Prompt Before Your Program Runs 


Problem 


You want to have a prompt print before the program runs, not just after it 
completes. It would be a handy way to timestamp start and finish times. 


Solution 


This solution is only for bash 4.4 or newer. That version of bash introduced
the $PS0 prompt. If set, the prompt string will be evaluated and printed prior
to the execution of any command that you have typed.


Here’s a way to use both $PS0 and $PS1 to display start and end timestamps
of commands that you run:


PS0='                                        \t\n'
PS1='--------------------------------------- \t\n\! \$ '


Discussion


The pre-execution prompt is $PS0. It will be displayed just before the shell
begins to execute your command. The leading blanks are there to move the
output more to the right; feel free to add more blanks to move it farther to the
right, or fewer to move it more to the left. Similarly, the dashes in $PS1 move
the timestamp to the right, and also delineate between commands, for easier
visual scanning. Again, feel free to add more (or replace them with spaces) to
taste.


The key element of both of these prompts is the \t. It will be translated into 
the timestamp. The \n is just a newline for proper formatting. If you set both 
prompts, as shown in the solution, and then run a command that may take a 
bit of time, like sleep 5, then you can see the resulting timestamps: 


1037 $ echo 'sleep...' ; sleep 5; echo 'awake!'
                                        21:36:59
sleep...
awake!
--------------------------------------- 21:37:04
1038 $


TIP 


If you’d like the $PS0 prompt to print on the same line as the command that you
typed, put an \e[A just before the \t. You’ll probably want to add more spaces,
too, to get the timestamp farther to the right.


To try out the prompt string before setting it, you can use another feature that is
only for bash version 4.4 or newer. Assign the string to some variable and echo
its value with the @P operator. For example:


$ MYTRY=' \! \h \t\n'
$ echo "${MYTRY}"
 \! \h \t\n
$ echo "${MYTRY@P}"
 1015 monarch 14:07:45


Without the @P it will echo the characters as you typed them; with the @P it will
interpret the special sequences as if it were a prompt string. When you have the
variable showing what you want, then assign it to $PS0.


The timestamp resolution is only in seconds, so it is not meant for precise 

performance measurements, but it can be very useful in looking back over a 
series of commands (especially if you walked away from your screen to get 
coffee) to see what transpired and which commands took a long time to run. 


See Also 


- Recipe 16.2, “Customizing Your Prompt”


16.4 Changing Your $PATH Permanently 


Problem 


You need to permanently change your path. 


Solution 


First you need to discover where the path is set, and then update it. For your
local account, it’s probably set in ~/.profile or ~/.bash_profile. Find the file
with grep -l PATH ~/.[^.]* and edit it with your favorite editor; then
source the file to have the change take effect immediately.


If you are root and you need to set the path for the entire system, the basic
procedure is the same, but there are different files in /etc where the $PATH
may be set, depending on your operating system and version. The most likely
file is /etc/profile, but /etc/bashrc, /etc/rc, /etc/default/login,
~/.ssh/environment, and the PAM /etc/environment files are also possible.


On some systems there is a directory called /etc/profile.d which contains shell 
scripts to be run on startup. You can modify an existing script or add a new 
script to this directory to accomplish your change. The various scripts in this 
directory are just there as a way to organize or modularize the various
settings rather than having them all in one big file. 


Discussion 


The grep -l PATH ~/.[^.]* command is interesting because of the nature
of shell wildcard expansion and the existence of the . and .. directories. See
Recipe 1.7 for details.


The locations listed in the $PATH have security implications, especially when
you are root. If a world-writable directory is in root’s path before the typical
directories (i.e., /bin, /sbin), then a local user can create files that root might
execute, doing arbitrary things to the system. This is the reason that the
current directory (.) should not be in root’s path either.


To avoid this issue: 

- Make root’s path as short as possible, and never use relative paths.
- Avoid having world-writable directories in root’s path.
- Consider setting explicit paths in shell scripts run by root.
- Consider hardcoding absolute paths to utilities used in shell scripts run by
  root.
- Put user or application directories last in the $PATH, and then only for
  unprivileged users.
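The first two points in this list can be checked mechanically. The following is a minimal sketch rather than a complete audit: the function name audit_path and its output wording are hypothetical, and it assumes a find that supports -L and -maxdepth (true of the GNU and BSD versions, but not required by POSIX):

```shell
# Hypothetical helper: report relative or world-writable entries in a
# PATH-style string ($1, defaulting to the current $PATH).
function audit_path {
    local path=${1:-$PATH} dir
    local -a dirs
    # Split on colons without triggering pathname expansion
    IFS=: read -ra dirs <<< "$path"
    for dir in "${dirs[@]}"; do
        case $dir in
            /*) # -L follows symlinks so the target's mode is tested
                if [ -d "$dir" ] &&
                   [ -n "$(find -L "$dir" -maxdepth 0 -perm -0002 2>/dev/null)" ]
                then
                    echo "world-writable: $dir"
                fi ;;
            *)  echo "relative: '$dir'" ;;
        esac
    done
}
```

Running audit_path with no arguments inspects your own $PATH; an empty entry (two adjacent colons) is reported as relative because the shell treats it as the current directory.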


See Also 
- Recipe 1.7, “Showing All Hidden (Dot) Files in the Current Directory”
- Recipe 4.1, “Running Any Executable”
- Recipe 14.3, “Setting a Secure $PATH”
- Recipe 14.9, “Finding World-Writable Directories in Your $PATH”
- Recipe 14.10, “Adding the Current Directory to the $PATH”
- Recipe 16.5, “Changing Your $PATH Temporarily”
16.5 Changing Your $PATH Temporarily


Problem 


You want to add a directory to your $PATH (or remove one) for this session 
only. 


Solution 
There are several ways to handle this problem. 


You can prepend or append a new directory using PATH="newdir:$PATH" or
PATH="$PATH:newdir", though you should make sure the directory isn’t
already in the $PATH first.


If you need to edit something in the middle of the path, you can echo the path
to the screen, then use your terminal’s kill and yank (copy and paste) facility
to duplicate it on a new line and edit it. Or, you can add the “[m]acros that are
convenient for shell interaction” from the readline documentation.
Specifically:


# edit the path
"\C-xp": "PATH=${PATH}\e\C-e\C-a\ef\C-f"
[...]
# Edit variable on current line.
"\M-\C-v": "\C-a\C-k$\C-y\M-\C-e\C-a\C-y="


Then pressing Ctrl-X P will display the $PATH on the current line for you to 
edit, while typing any variable name and pressing Meta-Ctrl-V will display 
that variable for editing. Very handy. 


For simple cases you can also use the function in Example 16-4 (adapted 
slightly from Red Hat Linux’s /etc/profile). 


Example 16-4. ch16/func_pathmunge 


# cookbook filename: func_pathmunge

# Adapted from Red Hat Linux
function pathmunge {
    if ! echo $PATH | /bin/egrep -q "(^|:)$1($|:)" ; then
        if [ "$2" = "after" ] ; then
            PATH="$PATH:$1"
        else
            PATH="$1:$PATH"
        fi
    fi
}


The egrep pattern looks for the value in $1 between two colons (:), or (|)
at the beginning (^) or end ($) of the $PATH string. We chose to
use a case statement in our function, and to force a leading and trailing : to
do the same thing. Ours is theoretically faster since it uses a shell builtin, but the
Red Hat version is more concise. Our version is also an excellent illustration
of the fact that the if command works on exit codes, so the first if works by
using the exit code set by grep, while the second requires the use of the test
operator ([]).
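The case statement technique mentioned above can be sketched as follows. This is only an illustration of the idea, not the function from the Red Hat file; the name pathmunge_case is hypothetical:

```shell
# Hypothetical case-based variant of pathmunge. Wrapping $PATH in
# colons lets a single pattern match the first, middle, and last entry.
function pathmunge_case {
    case ":$PATH:" in
        *":$1:"*)
            ;;                          # already present, nothing to do
        *)
            if [ "$2" = "after" ]; then
                PATH="$PATH:$1"
            else
                PATH="$1:$PATH"
            fi
            ;;
    esac
}
```

Because case is part of the shell itself, no external grep process is started, which is the "theoretically faster" point made in the text.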


For more complicated cases when you’d like a lot of error checking, you can
source and then use the more generic functions in Example 16-5.


Example 16-5. ch16/func_tweak_path


# cookbook filename: func_tweak_path

#++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Add a directory to the beginning or end of your path as long as it's not
# already present. Does not take into account symbolic links!
# Returns: 1 or sets the new $PATH
# Called like: add_to_path (pre|post) <directory>
function add_to_path {
    local location=$1
    local directory=$2

    # Make sure we have something to work with
    if [ -z "$location" -o -z "$directory" ]; then
        echo "$0:$FUNCNAME: requires a location and a directory to add" >&2
        echo "e.g. add_to_path pre /bin" >&2
        return 1
    fi

    # Make sure the directory is not relative
    if [ $(echo $directory | grep '^/') ]; then
        : echo "$0:$FUNCNAME: '$directory' is absolute" >&2
    else
        echo "$0:$FUNCNAME: can't add relative directory '$directory' to the \$PATH" >&2
        return 1
    fi

    # Make sure the directory to add actually exists
    if [ -d "$directory" ]; then
        : echo "$0:$FUNCNAME: directory exists" >&2
    else
        echo "$0:$FUNCNAME: '$directory' does not exist--aborting" >&2
        return 1
    fi

    # Make sure it's not already in the $PATH
    # (contains returns 0 when the directory is NOT present)
    if contains "$PATH" "$directory"; then
        : echo "$0:$FUNCNAME: adding directory to \$PATH" >&2
    else
        echo "$0:$FUNCNAME: '$directory' already in \$PATH--aborting" >&2
        return 1
    fi

    # Figure out what to do
    case $location in
        pre*  ) PATH="$directory:$PATH" ;;
        post* ) PATH="$PATH:$directory" ;;
        *     ) PATH="$PATH:$directory" ;;
    esac

    # Clean up the new path, then set it
    PATH=$(clean_path $PATH)
} # end of function add_to_path

#++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Remove a directory from your path, if present.
# Returns: sets the new $PATH
# Called like: rm_from_path <directory>
function rm_from_path {
    local directory=$1

    # Remove all instances of $directory from $PATH
    PATH=${PATH//$directory/}

    # Clean up the new path, then set it
    PATH=$(clean_path $PATH)
} # end of function rm_from_path

#++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Remove leading/trailing or duplicate ':', remove duplicate entries
# Returns: echoes the "cleaned up" path
# Called like: cleaned_path=$(clean_path $PATH)
function clean_path {
    local path=$1
    local newpath
    local directory

    # Make sure we have something to work with
    [ -z "$path" ] && return 1

    # Remove duplicate directories, if any
    for directory in ${path//:/ }; do
        contains "$newpath" "$directory" &&
            newpath="${newpath}:${directory}"
    done

    # Remove any leading ':' separators
    # Remove any trailing ':' separators
    # Remove any duplicate ':' separators
    newpath=$(echo $newpath | sed 's/^:*//; s/:*$//; s/::/:/g')

    # Return the new path
    echo $newpath
} # end of function clean_path

#++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Determine if the path contains a given directory
# Return 1 if target is contained within pattern, 0 otherwise
# Called like: contains $PATH $dir
function contains {
    local pattern=":$1:"
    local target=$2

    # This will be a case-sensitive comparison unless nocasematch is set
    case $pattern in
        *:$target:* ) return 1;;
        *           ) return 0;;
    esac
} # end of function contains


Use them as follows: 


$ source chpath

$ echo $PATH
/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:/home/jp/bin

$ add_to_path pre foo
-bash:add_to_path: can't add relative directory 'foo' to the $PATH

$ add_to_path post ~/foo
-bash:add_to_path: '/home/jp/foo' does not exist--aborting

$ add_to_path post '~/foo'
-bash:add_to_path: can't add relative directory '~/foo' to the $PATH

$ rm_from_path /home/jp/bin

$ echo $PATH
/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin

$ add_to_path /home/jp/bin
-bash:add_to_path: requires a location and a directory to add
e.g. add_to_path pre /bin

$ add_to_path post /home/jp/bin

$ echo $PATH
/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:/home/jp/bin

$ rm_from_path /home/jp/bin
$ add_to_path pre /home/jp/bin

$ echo $PATH
/home/jp/bin:/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin


Discussion 


There are four interesting things about this problem and the functions 
presented in Example 16-5 in the Solution. 


First, if you try to modify your $PATH or other environment variables in a 
shell script, it won’t work because scripts run in subshells that go away when 
the scripts terminate, taking any modified environment variables with them. 
So instead, we source the functions into the current shell and run them from 
there. 
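The difference between running and sourcing is easy to demonstrate. This short sketch uses a throwaway variable (demo_var, a name chosen only for illustration) and a temporary file standing in for a script:

```shell
demo_var=original

# A child process gets its own copy of the variable; its change is
# lost when the child exits.
bash -c 'demo_var=changed_in_child'
echo "after child process: $demo_var"

# Sourcing runs the same kind of assignment in the current shell,
# so the change persists.
tmpfile=$(mktemp)
echo 'demo_var=changed_by_source' > "$tmpfile"
source "$tmpfile"
echo "after source:        $demo_var"
rm -f "$tmpfile"
```

The first echo still reports the original value, while the second reports the value set by the sourced file. The same reasoning applies to $PATH.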


Second, you may notice that add_to_path post ~/foo returns “does not
exist” while add_to_path post '~/foo' returns “can’t add relative
directory.” That’s because ~/foo is expanded by the shell to /home/jp/foo
before the function ever sees it. Not accounting for shell expansion is a
common mistake. Use the echo command to see what the shell will actually
pass to your scripts and functions.


Next, you may note the use of lines such as echo "$0:$FUNCNAME: requires
a location and a directory to add" >&2. $0:$FUNCNAME is a handy
way to identify exactly where an error message is coming from. $0 is always
the name of the current program (-bash in the usage examples, and the name
of your script or program in other cases). Adding the function name makes it
easier to track down problems when debugging. Echoing to >&2 sends the
output to STDERR, where runtime user feedback, especially including
warnings or errors, should go.


Finally, you can argue that the functions have inconsistent interfaces, since
add_to_path and rm_from_path actually set $PATH, while clean_path
displays the cleaned-up path and contains returns true or false. We might
not do it that way in production either, but it makes this example more 
interesting and shows different ways to do things. And we might argue that 
the interfaces make sense given what the functions do. 


See Also 

- Recipe 10.5, “Using Functions: Parameters and Return Values”
- Recipe 14.3, “Setting a Secure $PATH”
- Recipe 14.9, “Finding World-Writable Directories in Your $PATH”
- Recipe 14.10, “Adding the Current Directory to the $PATH”
- Recipe 16.4, “Changing Your $PATH Permanently”
- Recipe 16.22, “Getting Started with a Custom Configuration”
- Appendix B


16.6 Setting Your $CDPATH 


Problem 


You want to make it easier to switch between several directories in various 
locations. 


Solution 


Set your $CDPATH appropriately. Your commonly used directories will likely 
be unique, so for a contrived example, suppose you spend a lot of time 
working with init’s rc directories: 


/home/jp$ cd rc3.d
bash: cd: rc3.d: No such file or directory

/home/jp$ export CDPATH='.:/etc'

/home/jp$ cd rc3.d
/etc/rc3.d

/etc/rc3.d$ cd rc5.d
/etc/rc5.d

/etc/rc5.d$ cd games
bash: cd: games: No such file or directory

/etc/rc5.d$ export CDPATH='.:/etc:/usr'

/etc/rc5.d$ cd games
/usr/games

/usr/games$


Discussion 


According to the Bash Reference Manual, $CDPATH is “a colon-separated list 
of directories used as a search path for the cd builtin command.” Think of it 
as $PATH for cd. It’s a little subtle, but can be very handy. 


If the argument to cd begins with a slash, $CDPATH will not be used. If
$CDPATH is used, the absolute pathname to the new directory is printed to
STDOUT, as in our example.
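If you would like to watch this behavior without changing your own settings, the following sketch builds a scratch directory tree and performs the lookup in a subshell. The function name cdpath_demo and the directory names projects, alpha, and elsewhere are hypothetical:

```shell
# Hypothetical demonstration: resolve a directory through $CDPATH
# inside a subshell, so the calling shell is left untouched.
function cdpath_demo {
    local base
    base=$(mktemp -d) || return 1
    mkdir -p "$base/projects/alpha" "$base/elsewhere"
    (
        cd "$base/elsewhere" || exit 1
        CDPATH=".:$base/projects"
        # alpha is not under the current directory, so cd finds it via
        # $CDPATH and prints the directory it changed to
        cd alpha
    )
    rm -rf "$base"
}
```

Calling cdpath_demo prints one line, the full path of the alpha directory, which is the same feedback shown in the rc3.d example.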


WARNING 
Watch out when running bash in POSIX mode (e.g., as /bin/sh or with
--posix). As the Bash Reference Manual notes:


If a non-empty directory name from $CDPATH is used, or if - is the first
argument, and the directory change is successful, the absolute pathname of
the new working directory is written to the standard output.


In other words, pretty much every time you use cd it will echo the new path to
STDOUT, which is not the standard behavior.


Common directories to include in $CDPATH are:


.
    The current directory (optional because this is implied)
~/
    Your home directory
..
    The parent directory
../..
    The grandparent directory
~/.dirlinks
    A hidden directory containing nothing but symbolic links to other
    commonly used directories


These suggestions result in this:


export CDPATH='.:~/:..:../..:~/.dirlinks'


See Also 

- help cd
- The Bash Reference Manual
- Recipe 16.15, “Creating a Better cd Command”
- Recipe 16.22, “Getting Started with a Custom Configuration”
- Recipe 18.1, “Moving Quickly Among Arbitrary Directories”


16.7 When Programs Are Not Found 
Problem


You want better control over what happens when a command is not found, 
perhaps just to give a better error message. 


Solution 
Add something like this to the top of your script, or better, to an rc file: 


function command_not_found_handle () 


{ 
echo "Sorry. $0: $1 not there." 
return 1 
} 
Discussion 


In bash 4.3 and later there is a special function that is called if the shell
cannot find the executable you want to run. The function is called
command_not_found_handle, and you can (re)define it for your custom
purposes. In this example we had the function simply echo the name of the
shell and then the command that couldn’t be found.


It is important that your function return a nonzero value to indicate that the 
invocation of the command did not succeed. Other parts of your script, or 
other callers of your script, may be depending on that information. 
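Here is a variant of the handler, offered as a sketch rather than a recommendation. Writing the message to STDERR and returning 127 (the status bash itself uses when a command is not found) are our own choices, as is the wording of the message:

```shell
# Variant handler: $1 is the name of the command that was not found.
function command_not_found_handle {
    echo "Sorry, '$1' was not found." >&2
    return 127      # nonzero, matching bash's usual "not found" status
}
```

With this definition in place, typing a nonexistent command prints the message on STDERR and leaves 127 in $?, so callers that test the exit status behave as they did before the handler was installed.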


Some administrators put a definition for the command_not_found_handle
function in a system-wide bashrc file like /etc/profile or similar. In it, they
look in /usr/lib or /usr/share for a Python script called command-not-found
(note the dashes, not underscores). That script looks in packages for the
command that just failed, to see if it can suggest installing a package to
provide the missing command. While helpful in some situations, it is just
noise for those cases where the command was simply mistyped.


See Also 
- Recipe 10.4, “Defining Functions”
- Recipe 10.5, “Using Functions: Parameters and Return Values”
- Recipe 19.14, “Avoiding “command not found” When Using Functions”


16.8 Shortening or Changing Command Names 


Problem 


You’d like to shorten a long or complex command you use often, or you’d 
like to rename a command you can’t remember or find awkward to type. 


Solution 


Do not manually rename or move executable files, as many aspects of Unix 
and Linux depend on certain commands existing in certain places; instead, 
you should use aliases, functions, and possibly symbolic links. 


According to the Bash Reference Manual, “Aliases allow a string to be
substituted for a word when it is used as the first word of a simple command.
The shell maintains a list of aliases that may be set and unset with the alias
and unalias builtin commands.” This means that you can rename
commands, or create a macro, by listing many commands in one alias; for
example, alias copy='cp' or alias ll.='ls -ld .*'.


Aliases are only expanded once, so you can change how a command works, 
as with alias ls='ls -F', without going into an endless loop. In most 
cases only the first word of the command line is checked for alias expansion, 
and aliases are strictly text substitutions; they cannot use arguments to 
themselves. In other words, you can’t do alias='mkdir $1 && cd $1' 
because that doesn’t work. 
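A function handles that case, because functions do receive arguments. As a minimal sketch (the name mkcd is hypothetical), the "make a directory and change into it" idea looks like this:

```shell
# Hypothetical function doing what the alias above cannot:
# create the directory named by $1, then change into it.
function mkcd {
    mkdir -p -- "$1" && cd -- "$1"
}
```

For example, mkcd ~/work/new-project creates any missing parent directories and leaves you in the new directory. Because the function runs in the current shell, the cd takes effect for your session.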


Functions are used in two different ways. First, they can be sourced into your 
interactive shell, where they become, in effect, shell scripts that are always 
held in memory. They are usually small, and are very fast since they are 
already in memory and are executed in the current process, not in a spawned 
subshell. Second, they may be used within a script as subroutines. Functions
do allow arguments. For example, as seen in Example 16-6 (from Recipe
6.19):


Example 16-6. ch06/func_calc 


# cookbook filename: func_calc 

# Trivial command-line calculator 
function calc { 
    # INTEGER ONLY! --> echo The answer is: $(( $* )) 
    # Floating point 
    awk "BEGIN {print \"The answer is: \" $* }"; 
} # end of calc 


For personal or system-wide use, you are probably better off using aliases or 
functions to rename or tweak commands, but symbolic links are very useful 
in allowing a command to be in more than one place at a time. For example, 
Linux systems almost always use /bin/bash while other systems may use 
/usr/bin/bash, /usr/local/bin/bash, or /usr/pkg/bin/bash. While there is a 
better way to handle this particular issue (using env; see Recipe 15.1), in 
general symbolic links may be used as a workaround. We do not recommend 
using hard links, as they are harder to see if you are not looking for them, and 
they are more easily disrupted by badly behaved editors and such. Symbolic 
links are just more obvious and intuitive. 


Discussion 


Usually, only the first word of a command line is checked for alias 
expansion. However, if the last character of the value of that alias is a space, 
the next word will be checked as well. In practice, this is rarely an issue. 
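
The trailing-space rule is easiest to see side by side. The following is a minimal sketch with made-up alias names (greet, say, say2); it assumes it is run as a script, which is why expand_aliases is turned on first:

```shell
shopt -s expand_aliases      # needed in scripts; already on in interactive shells

alias greet='hello'          # hypothetical alias that will be used as an argument
alias say='echo'             # no trailing space: the next word is left alone
alias say2='echo '           # trailing space: the next word is alias-checked too

plain=$(say greet)
chained=$(say2 greet)
printf '%s\n' "$plain" "$chained"
```

Here say greet prints greet, while say2 greet prints hello. The same mechanism is why some people define alias sudo='sudo ' so that their other aliases still expand after sudo.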


Since in bash aliases can’t use arguments (unlike in csh), you’ll need to use a 
function if you need to pass in arguments. Because both aliases and functions 
reside in memory, this is not a big difference. 


Unless the expand_aliases shell option is set, aliases are not expanded 
when the shell is not interactive. Best practices for writing scripts dictate that 
you not use aliases, since they may not be present on another system. You 
also need to define functions inside your script, or explicitly source them 
before use (see Recipe 19.14). Thus, the best place to define them is in your 
global /etc/bashrc or your local ~/.bashrc. 


See Also 


= Recipe 6.19, “Creating a Command-Line Calculator” 
= Recipe 10.4, “Defining Functions” 
= Recipe 10.5, “Using Functions: Parameters and Return Values” 
= Recipe 10.7, “Redefining Commands with alias” 
= Recipe 14.4, “Clearing All Aliases” 
= Recipe 15.1, “Finding bash Portably for #!” 
= Recipe 16.20, “Using Initialization Files Correctly” 
= Recipe 16.21, “Creating Self-Contained, Portable rc Files” 
= Recipe 16.22, “Getting Started with a Custom Configuration” 
= Recipe 19.14, “Avoiding “command not found” When Using Functions” 


16.9 Adjusting Shell Behavior and Environment 


Problem 


You want to adjust your shell environment to account for the way you work, 
your physical location, your language, and more. 


Solution 
See the tables in the sections “Builtin Shell Variables”, “set Options”, and 
“shopt Options” in Appendix A. 


Discussion 


There are three ways to adjust various aspects of your environment. set is 
standardized in POSIX and uses one-letter options. shopt is specifically for 
bash shell options. And there are many environment variables in use for 
historical reasons, as well as for compatibility with many third-party 
applications. How you adjust what, and where, can be very confusing. The 
tables in Appendix A will help you sort it out, but they’re too big to duplicate 
here. 
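
As a rough illustration of the three mechanisms, here is one setting of each kind and a way to query it. The particular options and the PAGER value are arbitrary picks for the sketch, not recommendations:

```shell
set -o notify            # POSIX set option (same as set -b): report finished jobs at once
shopt -s cdspell         # bash-only option: fix minor typos in cd arguments
export PAGER=less        # environment variable: inherited by child processes

# Query the current state of each.
[[ -o notify ]] && echo "notify is on"
shopt -q cdspell && echo "cdspell is on"
echo "PAGER=$PAGER"
```

To turn the first two off again, use set +o notify and shopt -u cdspell.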


See Also 


= help set 
= help shopt 
= The bash documentation (see http://www.bashcookbook.com) 
= “Builtin Shell Variables” in Appendix A 
= “set Options” in Appendix A 
= “shopt Options” in Appendix A 


16.10 Adjusting readline Behavior Using .inputrc 


Problem 


You’d like to adjust the way bash handles input, especially command 
completion. For example, you'd like it to be case-insensitive. 


Solution 

Edit or create a ~/.inputrc or /etc/inputrc file, as appropriate. There are many 
parameters you can adjust to your liking. To have readline use your file when 
it initializes, set $INPUTRC; for example, use INPUTRC=~/.inputrc. To 
reread the file and apply or test after making changes, use bind -f 
filename. 


We recommend you explore the readline documentation and the bind 
command, especially bind -v, bind -l, bind -s, and bind -p, though the 
last one is rather long and cryptic. 


For more on configuring readline, see “readline Init File Syntax” in 
Appendix A. Some useful settings for users from other environments, notably 
Windows, are: 


# This is a SUBSET of interesting inputrc settings, see Chapter 16: 
# "Getting Started with a Custom Configuration" for a longer example 
# To reread (and implement changes to this file) use: 
# bind -f $SETTINGS/inputrc 

# First, include any system-wide bindings and variable 
# assignments from /etc/inputrc 
# (fails silently if file doesn't exist) 
$include /etc/inputrc 

$if Bash 
  # Ignore case when doing completion 
    set completion-ignore-case on 
  # Completed dir names have a slash appended 
    set mark-directories on 
  # Completed names which are symlinks to dirs have a slash appended 
    set mark-symlinked-directories on 
  # List ls -F for completion 
    set visible-stats on 
  # Cycle through ambiguous completions instead of list 
    "\C-i": menu-complete 
  # Set bell to audible 
    set bell-style audible 
  # List possible completions instead of ringing bell 
    set show-all-if-ambiguous on 

  # From the readline documentation at 
  # https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#SEC12 
  # Macros that are convenient for shell interaction 
  # edit the path 
    "\C-xp": "PATH=${PATH}\e\C-e\C-a\ef\C-f" 
  # prepare to type a quoted word -- insert open and close double quotes 
  # and move to just after the open quote 
    "\C-x\"": "\"\"\C-b" 
  # insert a backslash (testing backslash escapes in sequences and macros) 
    "\C-x\\": "\\" 
  # Quote the current or previous word 
    "\C-xq": "\eb\"\ef\"" 
  # Add a binding to refresh the line, which is unbound 
    "\C-xr": redraw-current-line 
  # Edit variable on current line. 
    #"\M-\C-v": "\C-a\C-k$\C-y\M-\C-e\C-a\C-y=" 
    "\C-xe": "\C-a\C-k$\C-y\M-\C-e\C-a\C-y=" 
$endif 


You will want to experiment with these and other settings. Also note the 
$include to use the system settings, but make sure you can change them if 
you like. See Recipe 16.22 for the downloadable file. 


Discussion 


Many people are not aware of how customizable, not to mention powerful 
and flexible, the GNU Readline library is. Having said that, there is no “one 
size fits all” approach. You should work out a configuration that suits your 
needs and habits. 


Note that the first time readline is called it performs its normal startup file 
processing, including looking at $INPUTRC, or defaulting to ~/.inputrc if 
that’s not set. 


See Also 

= help bind 
= The readline docs (see http://www.bashcookbook.com/bashinfo/#readline) 
= Recipe 16.21, “Creating Self-Contained, Portable rc Files” 
= Recipe 16.22, “Getting Started with a Custom Configuration” 
= “readline Init File Syntax” in Appendix A 


16.11 Keeping a Private Stash of Utilities by Adding ~/bin 


Problem 


You have a stash of personal utilities you like to use, but you are not root on 
the system and can’t place them into the normal locations like /bin or 
/usr/local/bin, or there is some other reason to separate them. 


Solution 


Create a ~/bin directory, place your utilities in it, and add it to your path: 


PATH="$PATH:~/bin" 


You'll want to make this change in one of your shell initialization files, such 
as ~/.bashrc. Some systems already add $HOME/bin as the last directory in a 
nonprivileged user account by default, so check first. 
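
Because initialization files can be sourced more than once, it may help to add the directory only when it is missing. The following is a sketch under that assumption; path_append is a hypothetical helper name, not something bash provides:

```shell
# Append a directory to $PATH only if it is not already listed.
# path_append is a hypothetical helper name.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already present: do nothing
        *) PATH="$PATH:$1" ;;         # otherwise add it at the end
    esac
}

path_append "$HOME/bin"
```

Wrapping $PATH in colons lets a single pattern match the directory whether it is first, last, or in the middle of the list.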


Discussion 


As a fully qualified shell user (well, you bought this book), you’ll certainly 
be creating lots of scripts. It’s inconvenient to invoke scripts with their full 
pathname. By collecting your scripts in a ~/bin directory, you can make your 
scripts look like regular Unix programs—at least to you. 


For security reasons, don’t put your bin directory at the start of your path. 
Starting your path with ~/bin makes it easy to override system commands, 
which is inconvenient if it happens accidentally (we’ve all done it), and 
dangerous if it’s done maliciously. 


See Also 


= Recipe 14.9, “Finding World-Writable Directories in Your $PATH” 
= Recipe 14.10, “Adding the Current Directory to the $PATH” 
= Recipe 16.4, “Changing Your $PATH Permanently” 
= Recipe 16.5, “Changing Your $PATH Temporarily” 
= Recipe 16.8, “Shortening or Changing Command Names” 
= Recipe 19.4, “Naming Your Script “test”” 


16.12 Using Secondary Prompts: $PS2, $PS3, $PS4 


Problem 
You’d like to understand what the $PS2, $PS3, and $PS4 prompts do. 


Solution 


$PS2 is called the secondary prompt string and is used when you are 
interactively entering a command that you have not completed yet. It is 
usually set to >, but you can redefine it. For example: 


$ export PS2='Secondary: ' 

$ for i in $(ls) 
Secondary: do 
Secondary: echo $i 
Secondary: done 
colors 
deepdir 
trunc_PWD 


$PS3 is the select prompt, and is used by the select statement to prompt 
the user for a value. It defaults to #?, which isn’t very intuitive. You should 
change it before using the select command; for example: 


$ select i in $(ls) 
Secondary: do 
Secondary: echo $i 
Secondary: done 
1) colors 
2) deepdir 
3) trunc_PWD 
#? 1 
colors 
#? ^C 

$ export PS3='Choose a directory to echo: ' 

$ select i in $(ls); do echo $i; done 
1) colors 
2) deepdir 
3) trunc_PWD 
Choose a directory to echo: 2 
deepdir 
Choose a directory to echo: ^C 


$PS4 is displayed during trace output. Its first character is shown as many 
times as necessary to denote the nesting depth. The default is +. For example: 


$ cat demo 
#!/usr/bin/env bash 

set -o xtrace 

alice=girl 
echo "$alice" 

ls -l $(type -path vi) 

echo line 10 
echO line 11 
echo line 12 

$ ./demo 
+ alice=girl 
+ echo girl 
girl 
++ type -path vi 
+ ls -l /usr/bin/vi 
-r-xr-xr-x 6 root wheel 285108 May 8 2005 /usr/bin/vi 
+ echo line 10 
line 10 
+ echO line 11 
./demo: line 11: echO: command not found 
+ echo line 12 
line 12 

$ export PS4='+xtrace $LINENO: ' 

$ ./demo 
+xtrace 5: alice=girl 
+xtrace 6: echo girl 
girl 
++xtrace 8: type -path vi 
+xtrace 8: ls -l /usr/bin/vi 
-r-xr-xr-x 6 root wheel 285108 May 8 2005 /usr/bin/vi 
+xtrace 10: echo line 10 
line 10 
+xtrace 11: echO line 11 
./demo: line 11: echO: command not found 
+xtrace 12: echo line 12 
line 12 


Discussion 


The $PS4 prompt uses the $LINENO variable, which returns the line number in 
the function. Also note the single quotes, which defer expansion of the 
variable until display time. 
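
The same deferred expansion works for other variables. The sketch below adds the current function name to the trace prefix; the greet function is invented for the demonstration, and ${FUNCNAME[0]:-main} falls back to main when no function is running:

```shell
# Single quotes defer expansion until each trace line is printed.
PS4='+${LINENO}:${FUNCNAME[0]:-main}: '

# Hypothetical function, used only to produce some trace output.
greet() {
    local name=$1
    echo "hello, $name"
}

# xtrace writes to stderr, so capture stderr along with stdout.
# Tracing is enabled only inside the command substitution's subshell.
trace=$( { set -o xtrace; greet world; } 2>&1 )
printf '%s\n' "$trace"
```

Each traced command inside greet is prefixed with its line number and :greet:, which makes it easier to see where a long trace came from.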


See Also 


= Recipe 1.3, “Decoding the Prompt” 
= Recipe 3.7, “Selecting from a List of Options” 
= Recipe 6.16, “Creating Simple Menus” 
= Recipe 6.17, “Changing the Prompt on Simple Menus” 
= Recipe 16.2, “Customizing Your Prompt” 
= Recipe 19.13, “Debugging Scripts” 


16.13 Synchronizing Shell History Between Sessions 


Problem 


You run more than one bash session at a time and you would like to have a 
shared history between them. You'd also like to prevent the last session 
closed from clobbering the history from any other sessions. 


Solution 


Use the history command to synchronize your history between sessions 
manually or automatically. 


Discussion 


Using the default settings, the last shell to gracefully exit will overwrite your 
history file, so unless it is synchronized with any other shells you had open at 
the same time, it will clobber their histories. Using the shell option shown in 
Recipe 16.14 to append rather than overwrite the history file helps, but 
keeping your history in sync across sessions may offer additional benefits. 


Manually synchronizing history involves writing an alias to append the 
current history to the history file (history -a), then rereading anything new 
in that file into the current shell’s history (history -n): 


alias hs='history -a ; history -n' 


The disadvantage to this approach is that you must manually run the 
commands in each shell when you want to synchronize your history. 


To automate that approach, you could use the $PROMPT_COMMAND variable: 


PROMPT_COMMAND='history -a ; history -n' 


The value of $PROMPT_COMMAND is interpreted as a command to execute each 
time the default interactive prompt, $PS1, is displayed. The disadvantage to 
that approach is that it runs those commands every time $PS1 is displayed. 
That is very often, and on a heavily loaded or slower system that can cause a 
significant slowdown in your shell, especially if you have a large history. 
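
One possible middle ground, offered only as a sketch, is to append at every prompt but reread on demand. The function name hsync is hypothetical, and the case test assumes you want the prompt hook only in interactive shells:

```shell
shopt -s histappend          # append to, rather than overwrite, the history file

# Hypothetical helper for a manual full sync: write this session's new
# lines, then read in lines other sessions have added since the last read.
hsync() {
    history -a
    history -n
}

# At each prompt only append, and only in interactive shells.
case $- in
    *i*) PROMPT_COMMAND='history -a' ;;
esac
```

With this arrangement each session saves its commands as it goes, and you run hsync in a given shell only when you want to pull in what the others have written.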


See Also 


= help history 
= Recipe 16.14, “Setting Shell History Options” 


16.14 Setting Shell History Options 


Problem 


You’d like more control over your command-line history. 


Solution 


Set the $HIST* variables and shell options as desired. 


Discussion 


The $HISTFILESIZE variable sets the number of lines permitted in the 
$HISTFILE. The default for $HISTFILESIZE is 500 lines, and $HISTFILE is 
~/.bash_history unless you are in POSIX mode, in which case it’s 
~/.sh_history. Increasing $HISTFILESIZE may be useful, and unsetting it 
causes the $HISTFILE length to be unlimited. Changing $HISTFILE probably 
isn’t necessary, except that if it is not set or the file is not writable, no history 
will be written to disk. The $HISTSIZE variable sets the number of lines 
permitted in the history stack in memory. 


$HISTIGNORE and $HISTCONTROL control what goes into your history in the 
first place. $HISTIGNORE is more flexible since it allows you to specify 
patterns to decide what command lines to save to the history. $HISTCONTROL 
is more limited in that it supports only the few keywords listed here (any 
other value is ignored): 


ignorespace 


Command lines that begin with a space character are not saved in the 
history list. 


ignoredups 


Command lines that match the previous history entry are not saved in the 
history list. 


ignoreboth 


Shorthand for both ignorespace and ignoredups. 


erasedups 


All previous command lines that match the current line are removed from 
the history list before that line is saved. 


If $HISTCONTROL is not set, or does not contain any of these keywords, all 
commands are saved to the history list, subject to processing $HISTIGNORE. 
The second and subsequent lines of a multiline compound command are not 
tested, and are added to the history regardless of the value of $HISTCONTROL. 


(Material in the preceding paragraphs has been adapted from the Bash 
Reference Manual.) 
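
Put together, a starting point for ~/.bashrc might look like the following. The sizes and the ignore patterns are arbitrary values chosen for the sketch, so adjust them to taste:

```shell
HISTSIZE=10000                        # lines kept in memory (arbitrary value)
HISTFILESIZE=20000                    # lines kept in $HISTFILE (arbitrary value)
HISTCONTROL='ignorespace:erasedups'   # colon-separated list of keywords
HISTIGNORE='ls:bg:fg:history'         # colon-separated patterns; each must match the whole line
```

Because each $HISTIGNORE pattern has to match the complete command line, ls on its own is skipped here but ls -la is still saved.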


If set and non-null, the $HISTTIMEFORMAT variable available in bash 3 and 
later specifies an strftime format string to use when displaying or writing the 
history. If you don’t have bash version 3, but you do use a terminal with a 
scroll-back buffer, adding a date and timestamp to your prompt can also be 
very helpful (see Recipe 16.2). Watch out because stock bash does not put a 
trailing space after the format, but some systems (e.g., Debian) have patched 
it to do so: 


$ history 
    1  ls -la 
    2  help history 
    3  help fc 
    4  history 

# Ugly 
$ export HISTTIMEFORMAT='%Y-%m-%d_%H:%M:%S' 

$ history 
    1  2006-10-25_20:48:04ls -la 
    2  2006-10-25_20:48:11help history 
    3  2006-10-25_20:48:14help fc 
    4  2006-10-25_20:48:18history 
    5  2006-10-25_20:48:39export HISTTIMEFORMAT='%Y-%m-%d_%H:%M:%S' 
    6  2006-10-25_20:48:41history 

# Better 
$ HISTTIMEFORMAT='%Y-%m-%d_%H:%M:%S; ' 

$ history 
    1  2006-10-25_20:48:04; ls -la 
    2  2006-10-25_20:48:11; help history 
    3  2006-10-25_20:48:14; help fc 
    4  2006-10-25_20:48:18; history 
    5  2006-10-25_20:48:39; export HISTTIMEFORMAT='%Y-%m-%d_%H:%M:%S' 
    6  2006-10-25_20:48:41; history 
    7  2006-10-25_20:48:47; HISTTIMEFORMAT='%Y-%m-%d_%H:%M:%S; ' 
    8  2006-10-25_20:48:48; history 

# Getting tricky now 
$ HISTTIMEFORMAT=': %Y-%m-%d_%H:%M:%S; ' 

$ history 
    1  : 2006-10-25_20:48:04; ls -la 
    2  : 2006-10-25_20:48:11; help history 
    3  : 2006-10-25_20:48:14; help fc 
    4  : 2006-10-25_20:48:18; history 
    5  : 2006-10-25_20:48:39; export HISTTIMEFORMAT='%Y-%m-%d_%H:%M:%S' 
    6  : 2006-10-25_20:48:41; history 
    7  : 2006-10-25_20:48:47; HISTTIMEFORMAT='%Y-%m-%d_%H:%M:%S; ' 
    8  : 2006-10-25_20:48:48; history 


The last example uses the : builtin with the ; metacharacter to encapsulate 
the date stamp into a “do nothing” command (e.g., : 2006-10-25_20:48:48;). 
This allows you to reuse a literal line from the history file 
without having to bother parsing out the date stamp. Note the space after the 
: is required. 


There are also shell options to configure history file handling. If histappend 
is set, the shell appends to the history file; otherwise, it overwrites the history 
file. Note that it is still truncated to $HISTFILESIZE. If cmdhist is set, 
multiline commands are saved as a single line, with semicolons added as 
needed. If lithist is set, multiline commands are saved with embedded 
newlines. 


See Also 

= help history 
= help fc 
= Recipe 6.11, “Looping with a read” 
= Recipe 16.2, “Customizing Your Prompt” 
= Recipe 16.9, “Adjusting Shell Behavior and Environment” 


16.15 Creating a Better cd Command 


Problem 


You cd into a lot of deep directories and would like to be able to type cd ..... 
instead of cd ../../../.. to move up four levels. 


Solution 
Use the function in Example 16-7. 


Example 16-7. ch16/func_cd 


# cookbook filename: func_cd 

# Allow use of 'cd ...' to cd up 2 levels, 'cd ....' up 3, etc. (like 4NT/4DOS) 
# Usage: cd ..., etc. 
function cd { 

    local option= length= count= cdpath= i= # Local scope and start clean 

    # If we have a -L or -P symlink option, save then remove it 
    if [ "$1" = "-P" -o "$1" = "-L" ]; then 
        option="$1" 
        shift 
    fi 

    # Are we using the special syntax? Make sure $1 isn't empty, then 
    # match the first 3 characters of $1 to see if they are '...', then 
    # make sure there isn't a slash by trying a substitution; if it fails, 
    # there's no slash. 
    if [ -n "$1" -a "${1:0:3}" = '...' -a "$1" = "${1%/*}" ]; then 
        # We are using special syntax 
        length=${#1} # Assume that $1 has nothing but dots and count them 
        count=2      # 'cd ..' still means up one level, so ignore first two 

        # While we haven't run out of dots, keep cd'ing up 1 level 
        for ((i=$count;i<=$length;i++)); do 
            cdpath="${cdpath}../" # Build the cd path 
        done 

        # Actually do the cd 
        builtin cd $option "$cdpath" 
    elif [ -n "$1" ]; then 
        # We are NOT using special syntax; just plain old cd by itself 
        builtin cd $option "$*" 
    else 
        # We are NOT using special syntax; plain old cd by itself to home dir 
        builtin cd $option 
    fi 
} # end of cd 


Discussion 


The cd command takes an optional -L or -P argument that, respectively, 
follows symbolic links or the physical directory structure. Either way, we 
have to take them into account if we want to redefine how cd works. 


Then, we make sure $1 isn’t empty and match the first three characters of $1 
to see if they are .... We then make sure there isn’t a slash by trying a 
substitution; if it fails, there’s no slash. Both of these string routines require 
bash version 2.0+. After that, we build the actual cd command using a 
portable for loop and finally use the builtin command to use the shell cd and 
not create an endless loop by recursively calling our cd function. We also 
pass in the -L or -P argument if present. 
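
To experiment with the two string tests without actually changing directories, the path-building logic can be pulled out on its own. This is a sketch only; dots_to_path is a hypothetical helper, not part of the recipe's function:

```shell
# Hypothetical helper: turn '...' into '../../', '....' into '../../../', etc.
# Anything that is not the special syntax is passed through unchanged.
dots_to_path() {
    local arg=$1 path= i
    # Same tests as the recipe: non-empty, starts with '...', and
    # removing a trailing '/...' changes nothing (so there is no slash).
    if [ -n "$arg" ] && [ "${arg:0:3}" = '...' ] && [ "$arg" = "${arg%/*}" ]; then
        for (( i = 2; i <= ${#arg}; i++ )); do
            path="${path}../"
        done
        printf '%s\n' "$path"
    else
        printf '%s\n' "$arg"
    fi
}

dots_to_path '...'
dots_to_path '.....'
dots_to_path '../foo'
```

The three calls print ../../, ../../../../, and ../foo, respectively.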


See Also 

= help cd 
= http://jpsoft.com for the 4NT shell, which is the source of this idea 
= Recipe 15.5, “Using for Loops Portably” 
= Recipe 16.6, “Setting Your $CDPATH” 
= Recipe 16.16, “Creating and Changing Into a New Directory in One Step” 
= Recipe 16.17, “Getting to the Bottom of Things” 
= Recipe 16.22, “Getting Started with a Custom Configuration” 
= Recipe 18.1, “Moving Quickly Among Arbitrary Directories” 


16.16 Creating and Changing Into a New Directory in One Step 


Problem 


You often create new directories and immediately change into them for some 
operation, and all that typing is tedious. 


Solution 


Add the function in Example 16-8 to an appropriate configuration file, such 
as your ~/.bashrc file, and source it. 


Example 16-8. ch16/func_mcd 

# cookbook filename: func_mcd 
# mkdir newdir then cd into it 
# usage: mcd (<mode>) <dir> 

function mcd { 
    local newdir='_mcd_command_failed_' 

    if [ -d "$1" ]; then         # Dir exists, mention that... 
        echo "$1 exists..." 
        newdir="$1" 
    else 
        if [ -n "$2" ]; then     # We've specified a mode 
            command mkdir -p -m $1 "$2" && newdir="$2" 
        else                     # Plain old mkdir 
            command mkdir -p "$1" && newdir="$1" 
        fi 
    fi 
    builtin cd "$newdir"         # No matter what, cd into it 
} # end of mcd 


For example: 


$ source mcd 

$ pwd 
/home/jp 

$ mcd 0700 junk 

$ pwd 
/home/jp/junk 

$ ls -ld . 
drwx------ 2 jp users 512 Dec 6 01:03 . 


Discussion 


This function allows you to optionally specify a mode for the mkdir 
command to use when creating the directory. If the directory already exists, it 
will mention that fact but still cd into it. We use the command command to 
make sure that we ignore any shell functions for mkdir, and the builtin 
command to make sure we only use the shell cd. 


We also assign _mcd_command_failed_ to a local variable in case the mkdir 
fails. If it works, the correct new directory is assigned. If it fails, when the cd 
tries to execute it will display a reasonably useful message, assuming you 
don’t have a lot of _mcd_command_failed_ directories lying around: 


$ mcd /etc/junk 
mkdir: /etc/junk: Permission denied 
-bash: cd: _mcd_command_failed_: No such file or directory 


$ 


You might think that we could easily improve this using break or exit if the 
mkdir fails. However, break only works in a for, while, or until loop, and 
exit will actually exit our shell, since a sourced function runs in the same 
process as the shell (we could use return, but we will leave that as an 
exercise for the reader): 


command mkdir -p "$1" && newdir="$1" || exit 1 # This will exit our shell 
command mkdir -p "$1" && newdir="$1" || break  # This will fail 

You could also place the following in a trivial function, but we obviously 
prefer the more robust version given in the solution: 


function mcd { mkdir "$1" && cd "$1"; } 


See Also 
= man mkdir 
= help cd 
= help function 
= Recipe 16.15, “Creating a Better cd Command” 
= Recipe 16.20, “Using Initialization Files Correctly” 
= Recipe 16.21, “Creating Self-Contained, Portable rc Files” 
= Recipe 16.22, “Getting Started with a Custom Configuration” 


16.17 Getting to the Bottom of Things 


Problem 


You work in a lot of narrow but deep directory structures, where all the 
content is at the bottom, and you’re tired of having to manually cd so many 
levels. 


Solution 


Here is an alias you can use to get to the bottom of things: 


alias bot='cd $(dirname $(find . | tail -n 1))' 


Discussion 


This use of find in a large directory structure such as /usr could take a while 
and isn’t recommended. 


Depending on how your directory structure is set up, this may not work for 
you; you'll have to try it and see. The find . will simply list all the files and 
directories in the current directory and below, the tail -n 1 will grab the 
last line, dirname will extract just the path, and cd will take you there. It may 
be possible for you to tweak the command to get it to put you in the right 
place. For example: 


alias bot='cd $(dirname $(find . | sort -r | tail -n 5 | head -1))' 


or: 


alias bot='cd $(dirname $(find . | sort -r | grep -v 'X11' | tail -n 3 \ 
    | head -1))' 


Keep trying the part in the innermost parentheses, especially tweaking the 
find command, until you get the results you need. Perhaps there is a key file 
or directory at the bottom of the structure, in which case the following 
function might work: 


function bot { cd $(dirname $(find . | grep -e "$1" | head -1)); } 


Note that aliases can’t use arguments, so this must be a function. We use grep 
rather than a -name argument to find because grep is much more flexible. 
Depending on your structure, you might want to use tail instead of head. 
Again, test the find command first. 
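
If "the bottom" simply means the most deeply nested directory, another approach is to count path components instead of relying on the order of find's output. The following is a sketch under that assumption; deepest_dir is a hypothetical name, and it will not cope with directory names that contain newlines:

```shell
# Hypothetical helper: print the most deeply nested directory under $1
# (default: the current directory).
deepest_dir() {
    find "${1:-.}" -type d \
        | awk -F/ '{ print NF "\t" $0 }' \
        | sort -n \
        | tail -n 1 \
        | cut -f2-
}
```

awk prefixes each path with its number of slash-separated components, sort -n orders by that number, and cut removes it again. You could then use cd "$(deepest_dir)" to go there. If several directories are equally deep, which one wins depends on sort's tie-breaking.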


See Also 


= man find 
= man dirname 
= man head 
= man tail 
= man grep 
= man sort 
= Recipe 16.15, “Creating a Better cd Command” 
= Recipe 16.16, “Creating and Changing Into a New Directory in One Step” 


16.18 Adding New Features to bash Using Loadable Builtins 


The material in this recipe also appears in Learning the bash Shell, 3rd 
Edition, by Cameron Newham (O’Reilly). 


Problem 


You have something that you’d like bash to do, but there’s no built-in 
command for it. For efficiency reasons, you want it to be built into the shell 
rather than an external program. Or, you already have the code in C and don’t 
want to or can’t rewrite it. 


Solution 


Use the dynamically loadable builtins introduced in bash version 2.0. The 
bash archive contains a number of prewritten builtins in the directory 
./examples/loadables/, especially the canonical hello.c. You can build them 
by uncommenting the lines in the file Makefile that are relevant to your 
system, and typing make. We’ll take one of these builtins, tty, and use it as a 
case study for builtins in general. 


The following is a list of the builtins provided in bash version 4.4’s 
./examples/loadables/ directory: 


basename.c  id.c         necho.c       printenv.c  strftime.c   uname.c 
cat.c       ln.c         pathchk.c     push.c      sync.c       unlink.c 
dirname.c   loadables.h  perl          realpath.c  tee.c        whoami.c 
finfo.c     logname.c    perl/bperl.c  rmdir.c     template.c 
head.c      mkdir.c      perl/iperl.c  setpgid.c   truefalse.c 
hello.c     mypid.c 


Discussion 


On systems that support dynamic loading, you can write your own builtins in 
C, compile them into shared objects, and load them at any time from within 
the shell with the enable builtin. 


We will discuss briefly how to go about writing a builtin and loading it in 
bash. This discussion assumes that you have experience with writing, 
compiling, and linking C programs. 


tty will mimic the standard Unix command tty. It will print the name of the 
terminal that is connected to standard input. The builtin will, like the 
command, return true if the device is a TTY and false if it isn’t. In addition, it 
will take an option, -s, which specifies that it should work silently (i.e., print 
nothing and just return a result). 


The C code for a builtin can be divided into three distinct sections: the code 
that implements the functionality of the builtin, a help text message 
definition, and a structure describing the builtin so that bash can access it. 


The description structure is quite straightforward and takes the form: 


struct builtin builtin_name_struct = { 
    "builtin_name", 
    function_name, 
    BUILTIN_ENABLED, 
    help_array, 
    "usage", 
    0 
}; 


The trailing _struct is required on the first line to give the enable builtin a 
way to find the symbol name. builtin_name is the name of the builtin as it 
appears in bash. The next field, function_name, is the name of the C 
function that implements the builtin. We’ll look at this in a moment. 
BUILTIN_ENABLED is the initial state of the builtin, whether it is enabled or 
not. This field should always be set to BUILTIN_ENABLED. help_array is an 
array of strings that are printed when help is used on the builtin. usage is the 
shorter form of help: the command and its options. The last field in the 
structure should be set to 0. 


In our example we’ll call the builtin tty, the C function tty_builtin, and the 
help array tty_doc. The usage string will be tty [-s]. The resulting 
structure looks like this: 


struct builtin tty_struct = { 
    "tty", 
    tty_builtin, 
    BUILTIN_ENABLED, 
    tty_doc, 
    "tty [-s]", 
    0 
}; 


The next section is the code that does the work. It looks like this: 


int 
tty_builtin (list) 
    WORD_LIST *list; 
{ 
    int opt, sflag; 
    char *t; 

    reset_internal_getopt ( ); 
    sflag = 0; 
    while ((opt = internal_getopt (list, "s")) != -1) 
    { 
        switch (opt) 
        { 
            case 's': 
                sflag = 1; 
                break; 
            default: 
                builtin_usage ( ); 
                return (EX_USAGE); 
        } 
    } 
    list = loptend; 

    t = ttyname (0); 
    if (sflag == 0) 
        puts (t ? t : "not a tty"); 
    return (t ? EXECUTION_SUCCESS : EXECUTION_FAILURE); 
} 


Builtin functions are always given a pointer to a list of type WORD_LIST. If the 
builtin doesn’t actually take any options, you must call no_options(list) 
and check its return value before any further processing. If the return value is 
nonzero, your function should immediately return with the value EX_USAGE. 


You must always use internal_getopt rather than the standard C library 
getopt to process the built-in options. Also, you must reset the option 
processing first by calling reset_internal_getopt. 


Option processing is performed in the standard way, except if the options are 
incorrect, in which case you should return EX_USAGE. Any arguments left 
after option processing are pointed to by loptend. Once the function is 
finished, it should return the value EXECUTION_SUCCESS or 
EXECUTION_FAILURE. 


In the case of our tty builtin, we then just call the standard C library routine 
ttyname and, if the -s option wasn’t given, print out the name of the TTY (or 
“not a tty” if the device wasn’t). The function then returns success or failure, 
depending upon the result from the call to ttyname. 


The last major section is the help definition. This is simply an array of 
strings, the last element of the array being NULL. Each string is printed to 
standard output when help is run on the builtin. You should, therefore, keep 
the strings to 76 characters or less (an 80-character standard display minus a 
4-character margin). In the case of tty, our help text looks like this: 


char *tty_doc[] = { 
"tty writes the name of the terminal that is opened for standard", 
"input to standard output. If the `-s' option is supplied, nothing", 
"is written; the exit status determines whether or not the standard", 
"input is connected to a tty.", 

(char *)NULL 

}; 


The last things to add to our code are the necessary C header files. These are 
stdio.h and the bash header files config.h, builtins.h, shell.h, and 
bashgetopt.h. 


Example 16-9 shows the C program in its entirety. 


Example 16-9. ch16/builtin_tty.c 


/* cookbook filename: builtin_tty.c */ 

#include "config.h" 
#include <stdio.h> 
#include "builtins.h" 
#include "shell.h" 
#include "bashgetopt.h" 

extern char *ttyname ( ); 

/* Print the name of the terminal on standard input, or nothing with -s */ 
int 
tty_builtin (list) 
    WORD_LIST *list; 
{ 
    int opt, sflag; 
    char *t; 

    reset_internal_getopt ( ); 
    sflag = 0; 
    while ((opt = internal_getopt (list, "s")) != -1) 
    { 
        switch (opt) 
        { 
            case 's': 
                sflag = 1; 
                break; 
            default: 
                builtin_usage ( ); 
                return (EX_USAGE); 
        } 
    } 
    list = loptend; 

    t = ttyname (0); 
    if (sflag == 0) 
        puts (t ? t : "not a tty"); 
    return (t ? EXECUTION_SUCCESS : EXECUTION_FAILURE); 
} 

char *tty_doc[] = { 
    "tty writes the name of the terminal that is opened for standard", 
    "input to standard output. If the `-s' option is supplied, nothing", 
    "is written; the exit status determines whether or not the standard", 
    "input is connected to a tty.", 
    (char *)NULL 
}; 

struct builtin tty_struct = { 
    "tty", 
    tty_builtin, 
    BUILTIN_ENABLED, 
    tty_doc, 
    "tty [-s]", 
    0 
}; 


We now need to compile and link this as a dynamic shared object. 
Unfortunately, different systems have different ways to specify how to 
compile dynamic shared objects. 


The configure script should put the correct commands into the Makefile 
automatically. If for some reason it doesn’t, Table 16-1 lists some common 
systems and the commands needed to compile and link tty.c. Replace 
archive with the path of the top level of the bash archive. 


Table 16-1. Common systems and commands to compile and link tty.c 


System            Commands


SunOS 4           cc -pic -Iarchive -Iarchive/builtins -Iarchive/lib -c tty.c
                  ld -assert pure-text -o tty tty.o

SunOS 5           cc -K pic -Iarchive -Iarchive/builtins -Iarchive/lib -c tty.c
                  cc -dy -z text -G -i -h tty -o tty tty.o

SVR4, SVR4.2,     cc -K PIC -Iarchive -Iarchive/builtins -Iarchive/lib -c tty.c
Irix              ld -dy -z text -G -h tty -o tty tty.o

AIX               cc -K -Iarchive -Iarchive/builtins -Iarchive/lib -c tty.c
                  ld -bdynamic -bnoentry -bexpall -G -o tty tty.o

Linux             cc -fPIC -Iarchive -Iarchive/builtins -Iarchive/lib -c tty.c
                  ld -shared -o tty tty.o

NetBSD, FreeBSD   cc -fpic -Iarchive -Iarchive/builtins -Iarchive/lib -c tty.c
                  ld -x -Bshareable -o tty tty.o


After you have compiled and linked the program, you should have a shared 
object called tty. To load this into bash, just type enable -f tty tty. You 
can remove a loaded builtin at any time with the -d option; e.g., enable -d 
tty. 
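
The load and unload steps can be wrapped so that a failure is reported rather than silently ignored. This is only a sketch: the helper name load_builtin is hypothetical, and it assumes the shared object from this recipe sits in the current directory as ./tty and that your bash was built with dynamic loading support.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: load builtin $2 from shared object $1 and report the result
load_builtin() {
    local object=$1 name=$2
    if enable -f "$object" "$name" 2>/dev/null; then
        echo "loaded: $name"
    else
        echo "could not load $name from $object" >&2
        return 1
    fi
}

# Assumes ./tty is the shared object built above; nothing happens if it is absent
load_builtin ./tty tty && type tty
enable -d tty 2>/dev/null || true    # remove the loaded builtin again
```

If the load succeeds, type should describe tty as a shell builtin rather than as a file found on your $PATH.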

You can put as many builtins as you like into one shared object as long as the 
three main sections for each builtin are in the same C file. However, bash 
loads a shared object as a whole, so if you ask it to load one builtin from a 
shared object that has 20 builtins, it will load all 20 (but only one will be 
enabled). It’s best to keep the number of builtins small to avoid filling 
memory with unnecessary things, and to group similar builtins (e.g., pushd, 
popd, dirs) so that if the user enables one of them, all of them will be loaded 
and ready in memory for enabling. 


See Also 


• ./examples/loadables in any bash tarball newer than 2.0 


16.19 Improving Programmable Completion 


This recipe was adapted directly from Learning the bash Shell, 3rd Edition, 
by Cameron Newham (O’Reilly). 


Problem 


You love bash’s programmable completion but wish it could be more aware 
of context, especially for commands that you use often. 


Solution 


Find and install additional programmable completion libraries, or write your 
own. Some examples are provided in the bash tarball, in ./examples/complete. 
Some distributions (e.g., SUSE) have their own version in 
/etc/profile.d/complete.bash. However, the largest and most well known of 
the third-party libraries is certainly Ian Macdonald’s, which you may 
download as a tarball or RPM from 
http://www.caliban.org/bash/index.shtml#completion or 
https://github.com/scop/bash-completion/. This library is already included in 
Debian (and derivatives like Ubuntu and Mint), and it is present in Fedora 
Extras as well as other third-party repositories. 


WARNING 


According to Ian’s README: “Many of the completion functions assume GNU 
versions of the various text utilities that they call (e.g., grep, sed, and awk). 
Your mileage may vary.” 


At the time of this writing there are 103 modules provided by the 
bash-completion-20060301.tar.gz library. The following is an excerpted list: 


• bash alias completion 
• bash export completion 
• bash shell function completion 
• chown(1) completion 
• chgrp(1) completion 
• Red Hat & Debian GNU/Linux if{up,down} completion 
• cvs(1) completion 
• rpm(8) completion 
• chsh(1) completion 
• chkconfig(8) completion 
• ssh(1) completion 
• GNU make(1) completion 
• GNU tar(1) completion 
• jar(1) completion 
• Linux iptables(8) completion 
• tcpdump(8) completion 
• ncftp(1) bookmark completion 
• Debian dpkg(8) completion 
• Java completion 
• PINE address-book completion 
• mutt completion 
• Python completion 
• Perl completion 
• FreeBSD package management tool completion 
• mplayer(1) completion 
• gpg(1) completion 
• dict(1) completion 
• cdrecord(1) completion 
• yum(8) completion 
• smartctl(8) completion 
• vncviewer(1) completion 
• svn completion 


Discussion 


Programmable completion is a feature that was introduced in bash version 
2.04. It extends the built-in textual completion by providing hooks into the 
completion mechanism. This means that it is possible to write virtually any 
form of completion desired. For instance, if you were typing the man 
command, wouldn’t it be nice to be able to hit Tab and have the manual 
sections listed for you? Programmable completion allows you to do this and 
much more. 


This recipe will only look at the basics of programmable completion. If you 
need to delve into the inner depths and actually write your own completion 
code, first check the libraries of completion commands developed by other 
people to see if what you want has already been done or is available for use 
as an example. We’ll just outline the basic commands and procedures needed 
to use the completion mechanism, should you ever need to work on it 
yourself. 


To do textual completion in a particular way, you first have to tell the shell 
how to do it when you press the Tab key. This is done via the complete 
command. 

The main argument of complete is a name that can be the name of a 
command or anything else that you want textual completion to work with. As 
an example, we will look at the gunzip utility that allows compressed 
archives of various types to be uncompressed. Normally, if you were to type: 


gunzip [TAB][TAB] 


you would get a list of filenames from which to complete. This list would 
include all kinds of things that are unsuitable for gunzip. What you would 
really like is the subset of those files that are suitable for the utility to work 
on. You can set this up by using complete: 


complete -A file -X '!*.@(Z|gz|tgz)' gunzip 


Note that in order for @(Z|gz|tgz) to work, you will need extended pattern 
matching switched on via shopt -s extglob. 


Here we are telling the completion mechanism that when the gunzip 
command is typed in we want it to do something special. The -A flag is an 
action and takes a variety of arguments. In this case we provide file as the 
argument, which asks the mechanism to provide a list of files as possible 
completions. The next step is to cut this down by selecting only the files that 
we know will work with gunzip. We’ve done this with the -X option, which 
takes as its argument a filter pattern. When applied to the completion list, the 
filter removes anything matching the pattern; i.e., the result is everything 
that doesn’t match the pattern. gunzip can uncompress a number of file types, 
including those with the extensions .Z, .gz, and .tgz. We want to match all 
filenames with extensions that have one of these three patterns. We then have 
to negate this with a ! (remember, the filter removes the patterns that match). 


We can actually try this out first and see what completions would be returned 
without having to use complete to install the completion. We can do this via 
the compgen command: 


compgen -A file -X '!*.@(Z|gz|tgz)' 


This produces a list of completion strings (assuming you have some files in 
the current directory with these extensions). compgen is useful for trying out 
filters to see what completion strings are produced. It is also needed when 
more complex completion is required. We’ll see an example of this later in 
the recipe. 
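
To see the filter at work on a known set of files, you can run compgen in a throwaway directory. The following is a minimal sketch, assuming bash with extglob available; the helper name list_gunzip_candidates, the scratch directory, and the filenames are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Try out the gunzip filter from this recipe on a throwaway directory.
shopt -s extglob                  # needed for the @(Z|gz|tgz) pattern

# Hypothetical helper: list the files in directory $1 that gunzip could handle
list_gunzip_candidates() {
    ( cd "$1" && compgen -A file -X '!*.@(Z|gz|tgz)' | sort )
}

demo_dir=$(mktemp -d)             # scratch directory, removed below
touch "$demo_dir/archive.tgz" "$demo_dir/data.gz" "$demo_dir/file.Z" \
      "$demo_dir/notes.txt" "$demo_dir/script.sh"

# notes.txt and script.sh match the negated pattern and are filtered out
list_gunzip_candidates "$demo_dir"
rm -rf "$demo_dir"
```

Running the helper in a subshell keeps the cd from changing the directory of the calling shell.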


Once we install the preceding complete command, either by sourcing a script 
that contains it or by executing it on the command line, we can use the 
augmented completion mechanism with the gunzip command: 


$ gunzip [TAB][TAB] 
archive.tgz archive1.tgz file.Z 
$ gunzip 


You can probably see that there are other things we could do. What about 
providing a list of possible arguments for specific options to a command? For 
instance, the kill command takes a process ID, but can optionally take a 
signal name preceded by a dash (-) or a signal name following the option -n. 
We could complete with PIDs, but if there is a dash or a -n, it'll have to be 
done with signal names. 


This is slightly more complex than the previous one-line example. Here we 
will need some code to distinguish what has already been typed in. We’ll also 
need to get the PIDs and the signal names. We’ll put the code in a function 
and call the function via the completion mechanism. Here’s the code to call 
our function, which we’ll name _kill: 


complete -F _kill kill 


The -F option to complete tells it to call the function named _kill when it is 
performing textual completion for the kill command. The next step is to code 
the function, as seen in Example 16-10. 


Example 16-10. ch16/func_kill 
# cookbook filename: func_kill
_kill() {
    local cur
    local sign

    COMPREPLY=( )
    cur=${COMP_WORDS[COMP_CWORD]}

    if (($COMP_CWORD == 2)) && [[ ${COMP_WORDS[1]} == -n ]]; then
        # return list of available signals
        _signals
    elif (($COMP_CWORD == 1 )) && [[ "$cur" == -* ]]; then
        # return list of available signals
        sign="-"
        _signals
    else
        # return list of available PIDs
        COMPREPLY=( $( compgen -W '$( command ps axo pid | sed 1d )' $cur ) )
    fi
}


The code is fairly standard, apart from the use of some special environment 
variables and a call to a function called _signals, which we’ll come to 
shortly. 


The variable $COMPREPLY is used to hold the result that is returned to the 
completion mechanism. It is an array that holds a set of completion strings. 
Initially this is set to an empty array. 


The local variable $cur is a convenience variable to make the code more 
readable because the value is used in several places. Its value is derived from 
an element in the array $COMP_WORDS. This array holds the individual words 
on the current command line. $COMP_CWORD is an index into the array; it gives 
the word containing the current cursor position. The value of $cur is the 
word currently containing the cursor. 


The first if statement tests for the condition where the kill command is 
followed by the -n option. If the first word was -n and we are on the second 
word, then we need to provide a list of signal names for the completion 
mechanism. 


The second if statement is similar, except this time we are looking to 
complete on the current word, which starts with a dash and is followed by 
anything else. The body of this if again calls _signals, but this time it sets 
the sign variable to a dash. The reason for this will become obvious when we 
look at the _signals function. 


The remaining part in the else block returns a list of process IDs. It uses the 
compgen command to help create the array of completion strings. First it runs 
the ps command to obtain a list of PIDs, and then it pipes the result through 
sed to remove the first line (which is the heading “PID”). This is then given 
as an argument to the -W option of compgen, which takes a word list. 
compgen returns all the completion strings that match the value of the 
variable $cur, and the resulting array is assigned to $COMPREPLY. 


compgen is important here because we can’t just return the complete list of 
PIDs provided by ps. The user may have already typed part of a PID and then 
attempted completion. As the partial PID will be in the variable $cur, 
compgen restricts the results to those that match or partially match that value. 
For example, if $cur had the value 5, then compgen would return only values 
beginning with a 5, such as 5, 59, or 562. 
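
The prefix matching described above is easy to try in isolation. This is a small sketch in which a fixed word list stands in for the output of ps; the variable name pids and its contents are made up for illustration.

```shell
#!/usr/bin/env bash
# compgen -W keeps only the words that begin with the text typed so far,
# which is the role $cur plays in _kill.
pids="5 59 562 61 7 105"          # stand-in for the PIDs reported by ps
cur=5                             # what the user typed before pressing Tab

# Only words beginning with 5 survive; 61, 7, and 105 are dropped
compgen -W "$pids" -- "$cur"
```

The -- separates the options from the word being completed, which matters when that word itself begins with a dash.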


The last piece of the puzzle is the _signals function (Example 16-11). 


Example 16-11. ch16/func_signals 


# cookbook filename: func_signals
_signals() {
    local i

    COMPREPLY=( $( compgen -A signal SIG${cur#-} ))

    for (( i=0; i < ${#COMPREPLY[@]}; i++ )); do
        COMPREPLY[i]=$sign${COMPREPLY[i]#SIG}
    done
}


While we can get a list of signal names by using complete’s -A signal, the 
names are unfortunately not in a form that is very usable, so we can’t use this 
to directly generate the array of names. The names generated begin with the 
letters “SIG”, while the names needed by the kill command don’t. The 
_signals function should assign an array of signal names to $COMPREPLY, 
optionally preceded by a dash. 


First we generate the list of signal names with compgen. Each name starts 
with the letters “SIG”. In order to get complete to provide the correct subset if 
the user has begun to type a name, we add “SIG” to the beginning of the 
value in cur. We also take the opportunity to remove any preceding dash 
that the value has so it will match. 


We then loop on the array, removing the letters “SIG” and adding a dash if 
needed (the value of the variable sign) to each entry. 
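
Completion functions can be exercised without pressing Tab by setting the variables bash would normally set and then inspecting $COMPREPLY. The sketch below repeats the two functions from this recipe so that it is self-contained; try_completion is a hypothetical test helper, not part of bash or of any completion library.

```shell
#!/usr/bin/env bash
# Drive a completion function by hand: set COMP_WORDS and COMP_CWORD,
# call the function, then look at COMPREPLY.

_signals() {
    local i
    COMPREPLY=( $( compgen -A signal SIG${cur#-} ) )
    for (( i = 0; i < ${#COMPREPLY[@]}; i++ )); do
        COMPREPLY[i]=$sign${COMPREPLY[i]#SIG}
    done
}

_kill() {
    local cur sign
    COMPREPLY=( )
    cur=${COMP_WORDS[COMP_CWORD]}
    if (( COMP_CWORD == 2 )) && [[ ${COMP_WORDS[1]} == -n ]]; then
        _signals
    elif (( COMP_CWORD == 1 )) && [[ "$cur" == -* ]]; then
        sign="-"
        _signals
    else
        COMPREPLY=( $( compgen -W '$( command ps axo pid | sed 1d )' $cur ) )
    fi
}

# Hypothetical helper: simulate "words... [TAB]" with the cursor on the last word
try_completion() {
    COMP_WORDS=( "$@" )
    COMP_CWORD=$(( $# - 1 ))
    _kill
    printf '%s\n' "${COMPREPLY[@]}"
}

try_completion kill -n TE     # as if typing: kill -n TE[TAB]
try_completion kill -TE       # as if typing: kill -TE[TAB]
```

On a typical system the first call should offer TERM and the second -TERM, since SIGTERM is the only signal name beginning with SIGTE.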

Both complete and compgen have many other options and actions; far more 
than we can cover here. If you are interested in taking programmable 
completion further, we recommend looking in the Bash Reference Manual 
and downloading some of the many examples that are available on the 
internet or in the bash tarball, in ./examples/complete. 


See Also 


• help complete 
• help compgen 
• ./examples/complete in any bash tarball newer than 2.04 
• http://www.caliban.org/bash/index.shtml#completion 
• https://github.com/scop/bash-completion/ 


16.20 Using Initialization Files Correctly 


Problem 


You’d like to know just what the heck is with all the initialization, or rc, files. 


Solution 


Here’s a cheat sheet for these files and what to do with them. Some or all of 
these files may be missing from your system, depending on how it is set up. 
Systems that use bash by default (e.g., Linux) tend to have a complete set, 
while systems that use some other shell by default are usually missing at least 
some of them: 


/etc/profile 


Global login environment file for Bourne and similar login shells. We 
recommend you leave this alone unless you are the system administrator 
and know what you are doing. 


/etc/bashrc (Red Hat) or /etc/bash.bashrc (Debian) 


Global environment file for interactive bash subshells. We recommend 
you leave this alone unless you are the system administrator and know 
what you are doing. 


/etc/bash_completion 


If this exists, it’s almost certainly the configuration file for Ian 
Macdonald’s programmable completion library (see Recipe 16.19). We 
recommend looking into it; it’s pretty cool. 


/etc/profile.d/bash_completion.sh and /etc/bash_completion.d 
Other possible distribution-specific bash completion files. Common in 
Fedora these days, but may be on other systems. 

/etc/inputrc 
Global GNU readline configuration. We recommend tweaking this as 
desired for the entire system (if you are the administrator), or tweaking 
~/.inputrc for just you (see Recipe 16.22). This is not executed or sourced 
but read in via readline and $INPUTRC, and $include (or bind -f). Note 
that it may contain include statements to other readline files. 

~/.bashrc 
Personal environment file for interactive bash subshells. We recommend 
that you place your aliases, functions, and fancy prompts here. 

~/.bash_profile 
Personal profile file for bash login shells. We recommend that you make 
sure this sources ~/.bashrc, then ignore it. 

~/.bash_login 
Personal profile file for Bourne login shells; only used by bash if 
~/.bash_profile is not present. We recommend you ignore this. 

~/.profile 
Personal profile file for Bourne login shells; only used by bash if 
~/.bash_profile and ~/.bash_login are not present. We recommend you 
ignore this unless you also use other shells that use it. 

~/.bash_history 


Default storage file for your shell command history. We recommend you 
use the history tools (Recipe 16.14) to manipulate it instead of trying to 
directly edit it. This is not executed or sourced; it’s just a datafile. 


~/.bash_logout 


Executed when you log out. We recommend you place any cleanup 
routines (see Recipe 17.7) here. This is only executed on a clean logout 
(i.e., not if your session dies due to a dropped WAN link). 


~/.inputrc 


Personal customizations for GNU readline. We recommend tweaking this 
as desired (see Recipe 16.22). This is not executed or sourced but read in 
via readline and $INPUTRC, and $include (or bind -f). Note that it may 
contain include statements to other readline files. 
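

The recommendation that ~/.bash_profile simply source ~/.bashrc can be sketched as follows. The function name load_bashrc and its optional directory argument are hypothetical; they are there only so the idea can be tried against a scratch directory instead of your real home directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: source the .bashrc found under the given home
# directory (default: $HOME) if it is readable.
load_bashrc() {
    local home_dir=${1:-$HOME}
    [ -r "$home_dir/.bashrc" ] || return 0    # nothing to source is not an error
    . "$home_dir/.bashrc"
}

# In a real ~/.bash_profile the whole idea reduces to one line:
#   [ -r ~/.bashrc ] && . ~/.bashrc
```

Testing for readability first avoids an error message at login on accounts that have no ~/.bashrc.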


We realize this list is a bit tricky to follow; however, each OS or 
distribution may differ, since it’s up to the vendor exactly how these files are 
written. To really understand how your system works, read each of the files 
listed here. You can also temporarily add echo name_of_file >&2 to the very 
first line of any of them that are executed or sourced (i.e., skip /etc/inputrc, 
~/.inputrc, and ~/.bash_history). Note that this may interfere with some 
programs (notably scp and rsync) that are confused by extra output on STDOUT 
or STDERR, so remove these statements when you are finished. See the 
warning in Recipe 16.21 for more details. 


Use Table 16-2 as a guideline only, since it’s not necessarily how your 
system will work. (In addition to the login-related rc files listed in Table 
16-2, the ~/.bash_logout rc file is used when you log out cleanly from an 
interactive session.) 


Table 16-2. bash login rc files on Ubuntu 6.10 and Fedora Core 5 


Interactive login shell      Interactive non-login   Noninteractive shell   Noninteractive
                             shell (bash)            (script)               (bash -c ':')
                                                     (bash /dev/null)

Ubuntu 6.10:                 Ubuntu 6.10:            Ubuntu 6.10:           Ubuntu 6.10:
/etc/profile                 /etc/bash.bashrc        N/A                    N/A
/etc/bash.bashrc             ~/.bashrc
~/.bash_profile [a]          /etc/bash_completion
~/.bashrc
/etc/bash_completion

Fedora Core 5:               Fedora Core 5:          Fedora Core 5:         Fedora Core 5:
/etc/profile [b][c]          ~/.bashrc               N/A                    N/A
/etc/profile.d/colorls.sh    /etc/bashrc
/etc/profile.d/glib2.sh
/etc/profile.d/krb5.sh
/etc/profile.d/lang.sh
/etc/profile.d/less.sh
/etc/profile.d/vim.sh
/etc/profile.d/which-2.sh
~/.bash_profile [d]
~/.bashrc
/etc/bashrc


[a] If ~/.bash_profile is not found, then ~/.bash_login or ~/.profile will be 
attempted, in that order. 

[b] If $INPUTRC is not set and ~/.inputrc does not exist, set $INPUTRC to 
/etc/inputrc. 

[c] Red Hat’s /etc/profile also sources /etc/profile.d/*.sh files; see Recipe 
4.10 for details. 

[d] If ~/.bash_profile is not found, then ~/.bash_login or ~/.profile will be 
attempted, in that order. 


For more detail see the “Bash Startup Files” section in the Bash Reference 
Manual. 
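

Which set of files gets read depends on whether the shell is a login shell and whether it is interactive, and bash can report both. The helper below is a sketch; the name describe_shell is hypothetical, while shopt -q login_shell and the i flag in $- are standard bash features.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which kind of bash is running this code
describe_shell() {
    local login=non-login mode=noninteractive
    shopt -q login_shell && login=login           # set by bash for login shells
    case "$-" in *i*) mode=interactive ;; esac    # 'i' appears in $- when interactive
    echo "$mode $login"
}

describe_shell
```

Run from a script this reports a noninteractive, non-login shell; typed at the prompt after logging in over ssh you would expect it to report an interactive login shell.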


Discussion 


One of the tricky things in Unix or Linux is figuring out where to change 
something like the $PATH or prompt on the rare occasions when you do want 
to do it for the whole system. Different operating systems and even versions 
can put things in different places. This command has a pretty good chance of 
finding out where your system $PATH is set, for example: 


grep 'PATH=' /etc/{profile,*bash*,*csh*,rc*} 


If that doesn’t work, the only thing you can really do is grep all of /etc to find 
it, as in: 


find /etc -type f | xargs grep 'PATH=' 


Note that unlike most of the code in this book, this is better run as root. You 
can run it as a regular user and get some results, but you may miss something 
and you'll almost certainly get some “Permission denied” errors. 
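

A variation on the same search avoids trouble with unusual filenames and hides the permission errors. This is a sketch only; find_path_settings is a hypothetical helper name, and the directory argument is there so it can be pointed at something smaller than /etc.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the files under a directory (default /etc)
# that contain a PATH= assignment.
find_path_settings() {
    # -exec ... {} + hands filenames to grep safely, even with spaces in them;
    # grep -l prints only the names of the files that match
    find "${1:-/etc}" -type f -exec grep -l 'PATH=' {} + 2>/dev/null
}

# Usage: find_path_settings            # search all of /etc
#        find_path_settings /etc/profile.d
```

Discarding STDERR hides the “Permission denied” messages, so remember that a regular user may still be missing results from unreadable files.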


One of the other tricky things is figuring out what you can tweak and where 
to do that for your personal account. We hope this chapter has given you a lot 
of great ideas in that regard. 


See Also 

• man grep 
• man find 
• man xargs 
• The “Bash Startup Files” section in the Bash Reference Manual 
• Recipe 16.4, “Changing Your $PATH Permanently” 
• Recipe 16.14, “Setting Shell History Options” 
• Recipe 16.19, “Improving Programmable Completion” 
• Recipe 16.21, “Creating Self-Contained, Portable rc Files” 
• Recipe 16.22, “Getting Started with a Custom Configuration” 
• Recipe 17.7, “Clearing the Screen When You Log Out” 


16.21 Creating Self-Contained, Portable rc Files 


Problem 


You work on a number of machines, some of which you have limited or full 
root control over and some of which you do not, and you want to replicate a 
consistent bash environment while still allowing custom settings by operating 
system, machine, or other (e.g., work, home) criteria. 


Solution 


Put all of your customizations in files in a settings subdirectory, copy or 
rsync that directory to a location such as ~/ or /etc, and use includes and 
symbolic links (e.g., ln -s ~/settings/screenrc ~/.screenrc) as 
necessary. Use logic in your customization files to account for criteria such as 
operating system, location, etc. 
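

As one illustration of such logic, a customization file can choose an additional file to source based on the operating system. This is a minimal sketch: the helper name settings_for_os and the bashrc.linux, bashrc.bsd, and bashrc.generic filenames are hypothetical, and $SETTINGS is assumed to point at your settings directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an OS name (as printed by uname -s) to the
# name of a per-OS settings file.
settings_for_os() {
    case "$1" in
        Linux       ) echo "bashrc.linux"   ;;
        Darwin|*BSD ) echo "bashrc.bsd"     ;;
        *           ) echo "bashrc.generic" ;;
    esac
}

os_file=$(settings_for_os "$(uname -s)")

# Source the per-OS file only if it actually exists in the settings directory
if [ -r "${SETTINGS:-}/$os_file" ]; then
    . "${SETTINGS:-}/$os_file"
fi
```

The same case pattern works for $HOSTNAME or any other criterion, so one copy of the settings directory can serve several machines.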


You may also choose not to use leading dots in the filenames to make it a 
little easier to manage the files. As you saw in Recipe 1.7, the leading dot 
causes ls not to show the files by default, thus eliminating some clutter in 
your home directory listing. But since we’ll be using a directory that exists 
only to hold configuration files, using the dot is not necessary. Note that dot 
files are usually not used in /etc either, for the same reason. 


See Recipe 16.22 for a sample to get you started. 
Discussion 


Let’s take a look at the assumptions and criteria we used in developing this 
solution. First, the assumptions: 


• You have a complex environment in which you control some, but not all, 
  of the machines you use. 

• For machines you control, one machine exports /opt/bin and all other 
  machines NFS-mount it, so all configuration files reside there. We used 
  /opt/bin because it’s short and less likely to collide with existing 
  directories than /usr/local/bin, but feel free to use whatever makes sense. 

• For some machines with partial control, a system-wide configuration in 
  /etc is used. 

• For machines on which you have no administrative control, dot files are 
  used in ~/. 

• You have settings that will vary from machine to machine, and in different 
  environments (e.g., home or work). 


The criteria were as follows: 


• Require as few changes as possible when moving configuration files 
  between operating systems and environments. 

• Supplement, but do not replace, operating system default or system 
  administrator-supplied configurations. 

• Provide enough flexibility to handle the demands made by conflicting 
  settings (e.g., work and home CVS). 


WARNING 


While it may be tempting to put echo statements in your configuration files to 
see what’s going on, be careful. If you do that, scp, rsync, and probably any 
other rsh-like programs will fail with mysterious errors such as: 


# scp
# protocol error: bad mode
#
# rsync
# protocol version mismatch - is your shell clean?
# (see the rsync manpage for an explanation)
# rsync error: protocol incompatibility (code 2) at compat.c(62)


ssh itself works since it is actually interactive and the output is displayed on the 
screen rather than confusing the data stream. See the discussion in Recipe 14.22 
for details on why this happens. 


For debugging, put these two lines near the top of /etc/profile or 
~/.bash_profile, but see the warning we just gave about confusing the data 
stream: 


export PS4='+xtrace $LINENO: ' 
set -x 


As an alternative (or in addition) to using set -x, you can add lines such as 
the following to any or all of your configuration files: 


# E.g. in ~/.bash_profile
case "$-" in
    *i*) echo "$(date '+%Y-%m-%d_%H:%M:%S_%Z') Interactive" \
              "~/.bash_profile ssh=$SSH_CONNECTION" >> ~/rc.log ;;
    *  ) echo "$(date '+%Y-%m-%d_%H:%M:%S_%Z') Noninteractive" \
              "~/.bash_profile ssh=$SSH_CONNECTION" >> ~/rc.log ;;
esac


# In ~/.bashrc
case "$-" in
    *i*) echo "$(date '+%Y-%m-%d_%H:%M:%S_%Z') Interactive" \
              "~/.bashrc ssh=$SSH_CONNECTION" >> ~/rc.log ;;
    *  ) echo "$(date '+%Y-%m-%d_%H:%M:%S_%Z') Noninteractive" \
              "~/.bashrc ssh=$SSH_CONNECTION" >> ~/rc.log ;;
esac


Since there is no output to the terminal this will not interfere with commands, 
as we noted in the warning. Run a tail -f ~/rc.log command in one 
session and run your troublesome command (e.g., scp, cvs) from elsewhere to 
determine which configuration files are in use. You can then more easily 
track down the problem. 
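

If you add this logging to several files, the repeated case statement can be folded into one small function. This is a sketch of one way to do that; the function name rc_log and the optional $RC_LOG override are hypothetical, and by default it writes to the same ~/rc.log used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one line to the rc log naming the file being
# read and whether the shell is interactive. Nothing is written to the
# terminal, so scp and rsync are not disturbed.
rc_log() {
    local mode
    case "$-" in
        *i*) mode=Interactive    ;;
        *  ) mode=Noninteractive ;;
    esac
    echo "$(date '+%Y-%m-%d_%H:%M:%S_%Z') $mode $1 ssh=${SSH_CONNECTION:-}" \
        >> "${RC_LOG:-$HOME/rc.log}"
}

# Usage, e.g. as the first line of ~/.bashrc:
#   rc_log "~/.bashrc"
```

The function has to be defined in (or sourced by) every file that calls it, since you cannot assume which configuration file runs first.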


When making any changes to your configuration files, we strongly advise 
that you open two sessions. Make all your changes in one session and then 
log out and back in. If you broke something so that you can’t log back in, fix 
it from the second session and then try again from the first one. Do not log 
out of both terminals until you are absolutely sure you can log back in again. 
This goes triple if any changes you’re making could affect root. 


NOTE 


You really do need to log out and back in again. Sourcing the changed files is a 
help, but leftovers from the previous environment may allow things to work 
temporarily, until you start clean—and then things are broken. Make changes to 
the running environment as necessary, but don’t change the files until you are 
ready to test; otherwise you’re likely to forget and possibly be locked out if 
something is wrong. 


See Also 

• Recipe 1.7, “Showing All Hidden (Dot) Files in the Current Directory” 
• Recipe 14.23, “Disconnecting Inactive Sessions” 
• Recipe 16.20, “Using Initialization Files Correctly” 
• Recipe 16.22, “Getting Started with a Custom Configuration” 


16.22 Getting Started with a Custom Configuration 


Problem 


You'd like to tweak your environment but aren’t quite sure where to start. 


Solution 


Here are some samples to give you an idea of what you can do. We follow 
the suggestion in Recipe 16.21 to keep customizations separate for easy 
backouts and portability between systems. 


For system-wide profile settings, add the contents of Example 16-12 to 
/etc/profile. Since that file is also used by the true Bourne shell, be careful not 
to use any bash-only features (e.g., source instead of .) if you do this on a 
non-Linux system. Linux uses bash as the default shell for both /bin/sh and 
/bin/bash (except when it doesn’t, as in Ubuntu 6.10+, which uses dash). For 
user-only settings, add it to only one of ~/.bash_profile, ~/.bash_login, or 
~/.profile, in that order (whichever exists first). 


Example 16-12. ch16/add_to_bash_profile 


# cookbook filename: add_to_bash_profile
# Add this code to your ~/.bash_profile


# If we're running in bash, search for then source our settings
# You can also just hardcode $SETTINGS, but this is more flexible
if [ -n "$BASH_VERSION" ]; then
    for path in /opt/bin /etc ~ ; do
        # Use the first one found
        if [ -d "$path/settings" -a -r "$path/settings" -a -x "$path/settings" ]
        then
            export SETTINGS="$path/settings"
            break
        fi
    done
    source "$SETTINGS/bash_profile"
    #source "$SETTINGS/bashrc" # If necessary
fi


For system-wide environment settings, add the contents in Example 16-13 to 
/etc/bashrc (or /etc/bash.bashrc). 


Example 16-13. ch16/add_to_bashrc 


# cookbook filename: add_to_bashrc
# Add this code to your ~/.bashrc


# If we're running in bash, and it isn't already set,
# search for then source our settings
# You can also just hard code $SETTINGS, but this is more flexible
if [ -n "$BASH_VERSION" ]; then
    if [ -z "$SETTINGS" ]; then
        for path in /opt/bin /etc ~ ; do
            # Use the first one found
            if [ -d "$path/settings" -a -r "$path/settings" -a -x "$path/settings" ]
            then
                export SETTINGS="$path/settings"
                break
            fi
        done
    fi
    source "$SETTINGS/bashrc"
fi
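

The directory search used in both snippets can also be written as a function that stops at the first usable directory and reports failure when none is found. This is a sketch under that assumption; find_settings is a hypothetical name, and the candidate list is the same /opt/bin, /etc, ~ order used in the examples.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first "<dir>/settings" that is a readable,
# searchable directory; return nonzero if none of the candidates qualify.
find_settings() {
    local dir
    for dir in "$@"; do
        if [ -d "$dir/settings" ] && [ -r "$dir/settings" ] && [ -x "$dir/settings" ]; then
            echo "$dir/settings"
            return 0
        fi
    done
    return 1
}

# Same search order as the examples; only export when something was found
if found=$(find_settings /opt/bin /etc ~); then
    export SETTINGS="$found"
fi
```

Checking the return status lets the calling file skip the source command entirely instead of failing on a nonexistent path.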


Example 16-14 is a bash_profile. 
Example 16-14. ch16/bash_profile 


# cookbook filename: bash_profile


# settings/bash_profile: Login shell environment settings
# To reread (and implement changes to this file) use:
# source $SETTINGS/bash_profile


# Only if interactive bash with a terminal!
[ -t 1 -a -n "$BASH_VERSION" ] || return


# Failsafe. This should be set when we're called, but if not, the
# "not found" error messages should be pretty clear.
# Use leading ':' to prevent this from being run as a program after
# it is expanded.
: ${SETTINGS:='SETTINGS_variable_not_set'}


# DEBUGGING only--will break scp, rsync
# echo "Sourcing $SETTINGS/bash_profile..."
# export PS4='+xtrace $LINENO: '
# set -x


# Debugging/logging--will not break scp, rsync
#case "$-" in
#    *i*) echo "$(date '+%Y-%m-%d_%H:%M:%S_%Z') Interactive" \
#              "$SETTINGS/bash_profile ssh=$SSH_CONNECTION" >> ~/rc.log ;;
#    *  ) echo "$(date '+%Y-%m-%d_%H:%M:%S_%Z') Noninteractive" \
#              "$SETTINGS/bash_profile ssh=$SSH_CONNECTION" >> ~/rc.log ;;
#esac


# Use the keychain (http://www.funtoo.org/Keychain/) shell script
# to manage ssh-agent, if it's available. If it's not, you should look
# into adding it.
for path in $SETTINGS ${PATH//:/ }; do
    if [ -x "$path/keychain" ]; then
        # Load default id_rsa and/or id_dsa keys, add others here as needed
        # See also --clear --ignore-missing --noask --quiet --time-out
        $path/keychain ~/.ssh/id_?sa ~/.ssh/${USER}_?sa
        break
    fi
done


# Apply interactive subshell customizations to login shells too.
# The system profile file in /etc probably already does this.
# If not, it's probably better to do it manually in wherever you:
#     source "$SETTINGS/bash_profile"
# But just in case...
#for file in /etc/bash.bashrc /etc/bashrc ~/.bashrc; do
#    [ -r "$file" ] && source $file && break # Use the first one found
#done


# Do site- or host-specific things here
case $HOSTNAME in
    *.company.com     ) # source $SETTINGS/company.com
                      ;;
    host1.*           ) # host1 stuff
                      ;;
    host2.company.com ) # source .bashrc.host2
                      ;;
    drake.*           ) # echo DRAKE in bash_profile.jp!
                      ;;
esac


# Do this last because we basically fork off from here. If we exit screen
# we return to a fully configured session. The screen session gets configured
# as well, and if we never leave it, well, this session isn't that bloated.


# Only run if we are interactive and not already running screen
# AND '~/.use_screen' exists.
if [ "$PS1" -a $TERM != "screen" -a "$USING_SCREEN" != "YES" -a -f ~/.use_screen ]; \
then
    # We'd rather use 'type -P' here, but that was added in bash-2.05b and we
    # use systems we don't control with versions older than that. We can't
    # easily use 'which' since on some systems that produces output whether
    # the file is found or not.
    for path in ${PATH//:/ }; do
        if [ -x "$path/screen" ]; then
            # If screen(1) exists and is executable, run our wrapper
            [ -x "$SETTINGS/run_screen" ] && $SETTINGS/run_screen
        fi
    done
fi


Example 16-15 is a sample bashrc (we know this is long, but read it for 
ideas). 


Example 16-15. ch16/bashrc 


# cookbook filename: bashrc

# settings/bash_profile: subshell environment settings
# To reread (and implement changes to this file) use:
# source $SETTINGS/bashrc

# Only if interactive bash with a terminal!
[ -t 1 -a -n "$BASH_VERSION" ] || return

# Failsafe. This should be set when we're called, but if not, the
# "not found" error messages should be pretty clear.
# Use leading ':' to prevent this from being run as a program after
# it is expanded.
: ${SETTINGS:='SETTINGS_variable_not_set'}

# DEBUGGING only--will break scp, rsync
# echo "Sourcing $SETTINGS/bash_profile..."
# export PS4='+xtrace $LINENO: '
# set -x

# Debugging/logging--will not break scp, rsync
#case "$-" in
#    *i*) echo "$(date '+%Y-%m-%d_%H:%M:%S_%Z') Interactive" \
#              "$SETTINGS/bashrc ssh=$SSH_CONNECTION" >> ~/rc.log ;;
#    *  ) echo "$(date '+%Y-%m-%d_%H:%M:%S_%Z') Noninteractive" \
#              "$SETTINGS/bashrc ssh=$SSH_CONNECTION" >> ~/rc.log ;;
#esac

# In theory this is also sourced from /etc/bashrc (/etc/bash.bashrc)
# or ~/.bashrc to apply all these settings to login shells too. In practice
# if these settings only work sometimes (like in subshells), verify that.

# Source keychain file (if it exists) for SSH and GPG agents
[ -r "$HOME/.keychain/${HOSTNAME}-sh" ] \
    && source "$HOME/.keychain/${HOSTNAME}-sh"
[ -r "$HOME/.keychain/${HOSTNAME}-sh-gpg" ] \
    && source "$HOME/.keychain/${HOSTNAME}-sh-gpg"


# Set some more useful prompts
# Interactive command-line prompt
# ONLY set one of these if we really are interactive, since lots of people
# (even us sometimes) test to see if a shell is interactive using
# something like: if [ "$PS1" ]; then
case "$-" in
    *i*)
        #export PS1='\n[\u@\h t:\l l:$SHLVL h:\! j:\j v:\V]\n$PWD\$ '
        #export PS1='\n[\u@\h:T\l:L$SHLVL:C\!:\D{%Y-%m-%d_%H:%M:%S_%Z}]\n$PWD\$ '
        export PS1='\n[\u@\h:T\l:L$SHLVL:C\!:J\j:\D{%Y-%m-%d_%H:%M:%S_%Z}]\n$PWD\$ '
        #export PS2='> '                      # Secondary (i.e. continued) prompt
        #export PS3='Please make a choice: '  # Select prompt
        #export PS4='+xtrace $LINENO: '       # xtrace (debug) prompt
        export PS4='+xtrace $BASH_SOURCE::$FUNCNAME-$LINENO: ' # xtrace prompt

        # If this is an xterm set the title to user@host:dir
        case "$TERM" in
            xterm*|rxvt*)
                PROMPT_COMMAND='echo -ne "\033]0;${USER}@${HOSTNAME}:$PWD\007"'
            ;;
        esac
    ;;
esac


# Make sure custom inputrc is handled, if we can find it; note different 
# names. Also note different order, since for this one we probably want 
# our custom settings to override the system file, if present. 
for file in $SETTINGS/inputrc ~/.inputrc /etc/inputrc; do
    [ -r "$file" ] && export INPUTRC="$file" && break # Use first found
done


# No core files by default 
# See also /etc/security/limits.conf on many Linux systems. 


ulimit -S -c 0 > /dev/null 2>&1 


# Set various aspects of the bash history
export HISTSIZE=5000          # Num. of commands in history stack in memory
export HISTFILESIZE=5000      # Num. of commands in history file
#export HISTCONTROL=ignoreboth # bash < 3, omit dups & lines starting with spaces
export HISTCONTROL='erasedups:ignoredups:ignorespace'
export HISTIGNORE='&:[ ]*'    # bash >= 3, omit dups & lines starting with spaces
#export HISTTIMEFORMAT='%Y-%m-%d_%H:%M:%S_%Z=' # bash >= 3, timestamp hist file
shopt -s histappend           # Append rather than overwrite history on exit
shopt -q -s cdspell           # Auto-fix minor typos in interactive use of 'cd'
shopt -q -s checkwinsize      # Update the values of LINES and COLUMNS
shopt -q -s cmdhist           # Make multiline commands 1 line in history
set -o notify   # (or set -b) # Immediate notif. of background job termination.
set -o ignoreeof              # Don't let Ctrl-D exit the shell

# Other bash settings
PATH="$PATH:/opt/bin"
export MANWIDTH=80            # Manpage width, use < 80 if COLUMNS=80 & less -N
export LC_COLLATE='C'         # Set traditional C sort order (e.g. UC first)
export HOSTFILE='/etc/hosts'  # Use /etc/hosts for hostname completion
export CDPATH='.:~/:..:../..' # Similar to $PATH, but for use by 'cd'
# Note that the '.' in $CDPATH is needed so that cd will work under POSIX mode
# but this will also cause cd to echo the new directory to STDOUT!
# And see also "cdspell" above!


# Import bash completion settings, if they exist in the default location
# and if not already imported (e.g. "$BASH_COMPLETION_COMPAT_DIR" NOT set).
# This can take a second or two on a slow system, so you may not always
# want to do it, even if it does exist (which it doesn't by default on many
# systems, e.g. Red Hat).
if [ -z "$BASH_COMPLETION_COMPAT_DIR" ] && ! shopt -oq posix; then
    if [ -f /usr/share/bash-completion/bash_completion ]; then
        . /usr/share/bash-completion/bash_completion
    elif [ -f /etc/bash_completion ]; then
        . /etc/bash_completion
    fi
fi


# Use a lesspipe filter, if we can find it. This sets the $LESSOPEN variable.
# Globally replace the $PATH ':' delimiter with space for use in a list.
for path in $SETTINGS /opt/bin ~/ ${PATH//:/ }; do
    # Use first one found of 'lesspipe.sh' (preferred) or 'lesspipe' (Debian)
    [ -x "$path/lesspipe.sh" ] && eval $("$path/lesspipe.sh") && break
    [ -x "$path/lesspipe" ]    && eval $("$path/lesspipe")    && break
done


# Set other less & editor prefs (overkill)
export LESS="--LONG-PROMPT --LINE-NUMBERS --ignore-case --QUIET --no-init"
export VISUAL='vi' # Set a default that should always work
# We'd rather use 'type -P' here, but that was added in bash-2.05b and we use
# systems we don't control with versions older than that. We can't easily
# use 'which' since that produces output whether the file is found or not.
#for path in ${PATH//:/ }; do
#    # Overwrite VISUAL if we can find nano
#    [ -x "$path/nano" ] \
#      && export VISUAL='nano --smooth --const --nowrap --suspend' && break
#done
# See above notes re: nano for why we're using this for loop
for path in ${PATH//:/ }; do
    # Alias vi to vim in binary mode if we can
    [ -x "$path/vim" ] && alias vi='vim -b' && break
done
export EDITOR="$VISUAL"      # Yet Another Possibility
export SVN_EDITOR="$VISUAL"  # Subversion
alias edit=$VISUAL           # Provide a command to use on all systems


# Set ls options and aliases.
# Note all the colorizing may or may not work depending on your terminal
# emulation and settings, esp. ANSI color. But it shouldn't hurt to have.
# See above notes re: nano for why we're using this for loop.
for path in ${PATH//:/ }; do
    [ -r "$path/dircolors" ] && eval "$(dircolors)" \
      && LS_OPTIONS='--color=auto' && break
done
export LS_OPTIONS="$LS_OPTIONS -F -h"
# Using dircolors may cause csh scripts to fail with an
# "Unknown colorls variable 'do'." error. The culprit is the ":do=01;35:"
# part in the LS_COLORS environment variable. For a possible solution see
# http://forums.macosxhints.com/showthread.php?t=7287
# eval "$(dircolors)"
alias ls="ls $LS_OPTIONS"
alias ll="ls $LS_OPTIONS -l"
alias ll.="ls $LS_OPTIONS -ld" # Usage: ll. ~/.*
alias la="ls $LS_OPTIONS -la"
alias lrt="ls $LS_OPTIONS -alrt"

# Useful aliases
# Moved to a function: alias bot='cd $(dirname $(find . | tail -1))'
#alias clip='xsel -b'         # pipe stuff into right "X" clipboard
alias gc='xsel -b'            # "GetClip" get stuff from right "X" clipboard
alias pc='xsel -bi'           # "PutClip" put stuff to right "X" clipboard
alias clr='cd ~/ && clear'    # Clear and return $HOME
alias cls='clear'             # DOS-ish for clear
alias cal='cal -M'            # Start calendars on Monday
alias copy='cp'               # DOS-ish for cp
#alias cp='cp -i'             # Annoying Red Hat default from /root/.bashrc
alias cvsst='cvs -qn update'  # Hack to get concise CVS status (like svn st)
alias del='rm'                # DOS-ish for rm
alias df='df --print-type --exclude-type=tmpfs --exclude-type=devtmpfs'
alias diff='diff -u'          # Make unified diffs the default
alias jdiff="\diff --side-by-side --ignore-case --ignore-blank-lines\
 --ignore-all-space --suppress-common-lines" # Useful GNU diff command
alias dir='ls'                # DOS-ish for ls
alias hu='history -n && history -a' # Read new hist. lines; append current lines
alias hr='hu'                 # "History update" backward compat to 'hr'
alias inxi='inxi -c19'        # (Ubuntu) system information script
alias ipconfig='ifconfig'     # Windows-ish for ifconfig
alias lesss='less -S'         # Don't wrap lines
alias locate='locate -i'      # Case-insensitive locate
alias man='LANG=C man'        # Display manpages properly
alias md='mkdir'              # DOS-ish for mkdir
alias move='mv'               # DOS-ish for mv
#alias mv='mv -i'             # Annoying Red Hat default from /root/.bashrc
alias ntsysv='rcconf'         # Debian rcconf is pretty close to Red Hat ntsysv
#alias open='gnome-open'      # Open files & URLs using GNOME handlers; see run below
alias pathping='mtr'          # mtr - a network diagnostic tool
alias ping='ping -c4'         # Only 4 pings by default
alias r='fc -s'               # Recall and execute 'command' starting with...
alias rd='rmdir'              # DOS-ish for rmdir
# Tweaked from http://bit.ly/2fc4e8Z
alias randomwords="shuf -n102 /usr/share/dict/words \
    | perl -ne 'print qq(\u\$_);' | column"
alias ren='mv'                # DOS-ish for mv/rename
#alias rm='rm -i'             # Annoying Red Hat default from /root/.bashrc
alias reloadbind='rndc -k /etc/bind/rndc.key freeze \
    && rndc -k /etc/bind/rndc.key reload && rndc -k /etc/bind/rndc.key thaw'
    # Reload dynamic BIND zones after editing db.* files
alias svndiff='meld'          # Cool GUI diff, similar to TortoiseMerge
alias svnpropfix='svn propset svn:keywords "id url date"'
alias svnkey='svn propset svn:keywords "id url"'
alias svneol='svn propset svn:eol-style' # One of 'native', 'LF', 'CR', 'CRLF'
alias svnexe='svn propset svn:executable on'
alias top10='sort | uniq -c | sort -rn | head'
alias tracert='traceroute'    # DOS-ish for traceroute
alias vzip='unzip -lvM'       # View contents of ZIP file
alias wgetdir="wget --no-verbose --recursive --no-parent --no-directories \
 --level=1" # Grab a whole directory using wget
alias wgetsdir="wget --no-verbose --recursive --timestamping --no-parent \
 --no-host-directories --reject 'index.*'" # Grab a dir and subdirs
alias zonex='host -l'         # Extract (dump) DNS zone


# Date/time
alias iso8601="date '+%Y-%m-%dT%H:%M:%S%z'" # ISO 8601 time
alias now="date '+%F %T %Z(%z)'"            # More readable ISO 8601 local
alias utc="date --utc '+%F %T %Z(%z)'"      # More readable ISO 8601 UTC

# Neat stuff from http://xmodulo.com/useful-bash-aliases-functions.html
alias meminfo='free -m -l -t'   # See how much memory you have left
alias whatpid='ps auwx | grep'  # Get PID and process info
alias port='netstat -tulanp'    # Show which apps are connecting to the network

# If the script exists and is executable, create an alias to get
# web server headers
for path in ${PATH//:/ }; do
    [ -x "$path/lwp-request" ] && alias httpdinfo='lwp-request -eUd' && break
done

# Useful functions
# Use 'gnome-open' to "run" things
function run {
    [ -r "$*" ] && {
        gnome-open "$*" >& /dev/null
    } || {
        echo "'$*' not found or not readable!"
    }
}

# Python version of 'perl -c'
function python-c {
    python -m py_compile "$1" && rm -f "${1}c"
}

# cd to the bottom of a narrow but deep dir tree
function bot {
    local dir=${1:-.}
    #\cd $(dirname $(find $dir | tail -1))
    \cd $(find . -name CVS -prune -o -type d -print | tail -1)
}

# mkdir newdir then cd into it
# usage: mcd (<mode>) <dir>
function mcd {
    local newdir='_mcd_command_failed_'

    if [ -d "$1" ]; then         # Dir exists, mention that...
        echo "$1 exists..."
        newdir="$1"
    else
        if [ -n "$2" ]; then     # We've specified a mode
            command mkdir -p -m $1 "$2" && newdir="$2"
        else                     # Plain old mkdir
            command mkdir -p "$1" && newdir="$1"
        fi
    fi
    builtin cd "$newdir"         # No matter what, cd into it
} # end of mcd


# Trivial command-line calculator
function calc {
    # INTEGER ONLY! --> echo The answer is: $(( $* ))
    # Floating point
    awk "BEGIN {print \"$* = \" $* }";
    #awk "BEGIN {printf \"$* = %f\", $* }";
} # end of calc

function addup {
    awk '{sum += $1} END {print sum}'
}

# Allow use of 'cd ...' to cd up 2 levels, 'cd ....' up 3, etc. (like 4NT/4DOS)
# Usage: cd ..., etc.
function cd {

    local option= length= count= cdpath= i= # Local scope and start clean

    # If we have a -L or -P symlink option, save then remove it
    if [ "$1" = "-P" -o "$1" = "-L" ]; then
        option="$1"
        shift
    fi

    # Are we using the special syntax? Make sure $1 isn't empty, then
    # match the first 3 characters of $1 to see if they are '...', then
    # make sure there isn't a slash by trying a substitution; if it fails,
    # there's no slash.
    if [ -n "$1" -a "${1:0:3}" = '...' -a "$1" = "${1%/*}" ]; then
        # We are using special syntax
        length=${#1} # Assume that $1 has nothing but dots and count them
        count=2      # 'cd ..' still means up one level, so ignore first two

        # While we haven't run out of dots, keep cd'ing up 1 level
        for ((i=$count;i<=$length;i++)); do
            cdpath="${cdpath}../" # Build the cd path
        done

        # Actually do the cd
        builtin cd $option "$cdpath"
    elif [ -n "$1" ]; then
        # We are NOT using special syntax; just plain old cd by itself
        builtin cd $option "$*"
    else
        # We are NOT using special syntax; plain old cd by itself to home dir
        builtin cd $option
    fi
} # end of cd


# Do site- or host-specific things here
case $HOSTNAME in
    *.company.com     ) # source $SETTINGS/company.com
                      ;;
    host1.*           ) # host1 stuff
                      ;;
    host2.company.com ) # source .bashrc.host2
                      ;;
    drake.*           ) # echo DRAKE in bashrc.jp!
                        export TAPE=/dev/tape
                      ;;
esac


Example 16-16 is a sample inputrc.


Example 16-16. ch16/inputrc 


# cookbook filename: inputrc
# settings/inputrc: # readline settings
# To reread (and implement changes to this file) use:
# bind -f $SETTINGS/inputrc

# First, include any system-wide bindings and variable
# assignments from /etc/inputrc
# (fails silently if file doesn't exist)
$include /etc/inputrc

$if Bash
    # Ignore case when doing completion
    set completion-ignore-case on
    # Completed dir names have a slash appended
    set mark-directories on
    # Completed names which are symlinks to dirs have a slash appended
    set mark-symlinked-directories on
    # List ls -F for completion
    set visible-stats on
    # Cycle through ambiguous completions instead of list
    "\C-i": menu-complete
    # Set bell to audible
    set bell-style audible
    # List possible completions instead of ringing bell
    set show-all-if-ambiguous on

    # From the readline documentation at
    # https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#SEC12
    # Macros that are convenient for shell interaction
    # edit the path
    "\C-xp": "PATH=${PATH}\e\C-e\C-a\ef\C-f"
    # prepare to type a quoted word -- insert open and close double quotes
    # and move to just after the open quote
    "\C-x\"": "\"\"\C-b"
    # insert a backslash (testing backslash escapes in sequences and macros)
    "\C-x\\": "\\"
    # Quote the current or previous word
    "\C-xq": "\eb\"\ef\""
    # Add a binding to refresh the line, which is unbound
    "\C-xr": redraw-current-line
    # Edit variable on current line.
    #"\M-\C-v": "\C-a\C-k$\C-y\M-\C-e\C-a\C-y="
    "\C-xe": "\C-a\C-k$\C-y\M-\C-e\C-a\C-y="
$endif


# some defaults / modifications for the emacs mode
$if mode=emacs
    # allow the use of the Home/End keys
    "\e[1~": beginning-of-line
    "\e[4~": end-of-line

    # allow the use of the Delete/Insert keys
    "\e[3~": delete-char
    "\e[2~": quoted-insert

    # mappings for "page up" and "page down" to step to beginning/end of
    # the history
    # "\e[5~": beginning-of-history
    # "\e[6~": end-of-history

    # alternate mappings for "page up" and "page down" to search the history
    # "\e[5~": history-search-backward
    # "\e[6~": history-search-forward

    # MUCH nicer up-arrow search behavior!
    "\e[A": history-search-backward ## up-arrow
    "\e[B": history-search-forward  ## down-arrow

    # mappings for Ctrl-left-arrow and Ctrl-right-arrow for word moving
    ### These were/are broken, and /etc/inputrc has better anyway
    # "\e[5C": forward-word
    # "\e[5D": backward-word
    # "\e\e[C": forward-word
    # "\e\e[D": backward-word

    # for non RH/Debian xterm, can't hurt for RH/Debian xterm
    "\eOH": beginning-of-line
    "\eOF": end-of-line

    # for FreeBSD console
    "\e[H": beginning-of-line
    "\e[F": end-of-line
$endif


Example 16-17 is a sample bash_logout.


Example 16-17. ch16/bash_logout


# cookbook filename: bash_logout
# settings/bash_logout: execute on shell logout
# Clear the screen on logout to prevent information leaks, if not already
# set as an exit trap elsewhere
[ -n "$PS1" ] && clear


Finally, Example 16-18 is a sample run_screen (for GNU screen, which you
may need to install).


Example 16-18. ch16/run_screen 


#!/usr/bin/env bash 

# cookbook filename: run_screen 

# run_screen--Wrapper script intended to run from a "profile" file to run 
# screen at login time with a friendly menu. 


# Sanity check
if [ "$TERM" == "screen" -o "$TERM" == "screen-256color" ]; then
    printf "%b" "According to \$TERM = '$TERM' we're *already* using" \
      " screen.\nAborting...\n"
    exit 1
elif [ "$USING_SCREEN" == "YES" ]; then
    printf "%b" "According to \$USING_SCREEN = '$USING_SCREEN' we're" \
      " *already* using screen.\nAborting...\n"
    exit 1
fi

# The "$USING_SCREEN" variable is for the rare cases when screen does NOT set
# $TERM=screen. This can happen when 'screen' is not in TERMCAP or friends,
# as is the case on a Solaris 9 box we use but don't control. If we don't
# have some way to tell when we're inside screen, this wrapper goes into an
# ugly and confusing endless loop.

# Seed list with Exit and New options and see what screens are already running.
# The select list is whitespace-delimited, and we only want actual screen
# sessions, so use perl to remove whitespace, filter for sessions, and show
# only useful info from 'screen -ls' output.
available_screens="Exit New $(screen -ls \
  | perl -ne 's/\s+//g; print if s/^(\d+\..*?)(?:\(.*?\))?(\(.*?\))$/$1$2\n/;')"

# Print a warning if using runtime feedback
run_time_feedback=0
[ "$run_time_feedback" == 1 ] && printf "%b" "
###############################################################################
'screen' Notes:

1) Sessions marked 'unreachable' or 'dead' should be investigated and
removed with the -wipe option if appropriate.\n\n"

# Present a list of choices
PS3='Choose a screen for this session: '
select selection in $available_screens; do
    if [ "$selection" == "Exit" ]; then
        break
    elif [ "$selection" == "New" ]; then
        export USING_SCREEN=YES
        exec screen -c $SETTINGS/screenrc -a \
                    -S $USER.$(date '+%Y-%m-%d_%H:%M:%S%z')
        break
    elif [ "$selection" ]; then
        # Pull out just the part we need using cut
        # We'd rather use a 'here string' [$(cut -d'(' -f1 <<< $selection)]
        # than this echo, but they are only in bash-2.05b+
        screen_to_use="$(echo $selection | cut -d'(' -f1)"
        # Old: exec screen -dr $screen_to_use
        # Alt: exec screen -x $USER/$screen_to_use
        exec screen -r $USER/$screen_to_use
        break
    else
        printf "%b" "Invalid selection.\n"
    fi
done


Discussion 


See the code and the code’s comments for details. 


Something interesting happens if you set $PS1 at inappropriate times, or if 
you set traps using clear. Many people use code like this to test to see if the 
current shell is interactive: 


if [ -n "$PS1" ]; then
    : Interactive code here
fi


If you arbitrarily set $PS1 when the shell isn’t interactive, or if you set a
trap using just clear instead of [ "$PS1" ] && clear, you’ll get errors like
this when using scp or ssh noninteractively:


# e.g. from tput 
No value for TERM and no -T specified 


# e.g. from clear 
TERM environment variable not set. 
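

One way to sidestep the $PS1 problem entirely is to test the shell's own option flags instead. The following is a minimal sketch, assuming bash (or another POSIX shell) records the letter i in $- for interactive shells, which is the same check the sample bashrc uses. The function name is_interactive is made up for illustration:

```shell
# Hypothetical helper: succeed only when the current shell is interactive.
# It inspects the option flags in $- rather than relying on $PS1 being set.
is_interactive() {
    case "$-" in
        *i*) return 0 ;;   # 'i' present: interactive shell
        *)   return 1 ;;   # otherwise: script, scp, rsync, ssh command, etc.
    esac
}

# Guard anything that needs a terminal, such as clear or tput
if is_interactive; then
    echo "interactive: safe to set prompts or clear the screen"
else
    echo "noninteractive: skip terminal-dependent commands"
fi
```

Because the test never touches $PS1, it still gives the right answer if some other startup file has set a prompt in a noninteractive shell.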


See Also 

■ Chapter 17, Chapter 18, Chapter 19

■ Recipe 16.20, “Using Initialization Files Correctly”

■ Recipe 16.21, “Creating Self-Contained, Portable rc Files”

■ Recipe 17.5, “Sharing a Single bash Session”

■ Appendix C


Chapter 17. Housekeeping and Administrative Tasks


These recipes cover tasks that come up in the course of using or 
administering computers. They are presented here because they don’t fit well 
anywhere else in the book. 


17.1 Renaming Many Files


Problem 


You want to rename many files, but mv *.foo *.bar doesn’t work. Or, you
want to rename a group of files in arbitrary ways.


Solution 


We presented a simple loop to change file extensions in Recipe 5.18; see that 
recipe for more details. Here is a for loop example: 


for FN in *.bad
do
    mv "${FN}" "${FN%bad}bash"
done


What about more arbitrary changes? For example, say you are writing a book 
and want the chapter filenames to follow a certain format, but the publisher 
has a conflicting format. You could name the files like 
ChNN=Title=Author.odt, then use a simple for loop and cut in a command 
substitution to rename them: 


for i in *.odt; do mv "$i" "$(echo "$i" | cut -d'=' -f1,3)"; done
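

The same new name can also be built with parameter expansion alone, avoiding a subshell and cut for every file. This is a sketch rather than part of the recipe; the function name short_name is hypothetical, and it assumes the ChNN=Title=Author.odt layout described above:

```shell
# Hypothetical helper: print the first and last '='-delimited fields of a name,
# joined by '=', using only shell parameter expansion.
short_name() {
    # ${1%%=*} removes the longest suffix starting at the first '=' (keeps ChNN)
    # ${1##*=} removes the longest prefix ending at the last '='   (keeps Author.odt)
    printf '%s\n' "${1%%=*}=${1##*=}"
}

short_name 'ch01=Beginning Shell Scripting=JP.odt'   # prints ch01=JP.odt
```

In the rename loop this would be used as mv "$i" "$(short_name "$i")", or the expansion could be written inline in place of the cut pipeline.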


Discussion


You should always use quotes around file arguments in case there’s a space.
While testing the code in the solution we also used echo and angle brackets to
make it very clear what the arguments are (using set -x is also helpful).
Once we were very sure our command worked, we removed the angle
brackets and replaced echo with mv:


# Testing
$ for i in *.odt; do echo "<$i>" "<$(echo "$i" | cut -d'=' -f1,3)>"; done
<ch01=Beginning Shell Scripting=JP.odt> <ch01=JP.odt>
<ch02=Standard Output=CA.odt> <ch02=CA.odt>
<ch03=Standard Input=CA.odt> <ch03=CA.odt>
<ch04=Executing Commands=CA.odt> <ch04=CA.odt>
[...]

# Even more testing
$ set -x

$ for i in *.odt; do echo "<$i>" "<$(echo "$i" | cut -d'=' -f1,3)>"; done
++xtrace 1: echo ch01=Beginning Shell Scripting=JP.odt
++xtrace 1: cut -d= -f1,3
+xtrace 535: echo '<ch01=Beginning Shell Scripting=JP.odt>' '<ch01=JP.odt>'
<ch01=Beginning Shell Scripting=JP.odt> <ch01=JP.odt>
++xtrace 1: echo ch02=Standard Output=CA.odt
++xtrace 1: cut -d= -f1,3
+xtrace 535: echo '<ch02=Standard Output=CA.odt>' '<ch02=CA.odt>'
<ch02=Standard Output=CA.odt> <ch02=CA.odt>
++xtrace 1: echo ch03=Standard Input=CA.odt
++xtrace 1: cut -d= -f1,3
+xtrace 535: echo '<ch03=Standard Input=CA.odt>' '<ch03=CA.odt>'
<ch03=Standard Input=CA.odt> <ch03=CA.odt>
++xtrace 1: echo ch04=Executing Commands=CA.odt
++xtrace 1: cut -d= -f1,3
+xtrace 535: echo '<ch04=Executing Commands=CA.odt>' '<ch04=CA.odt>'
<ch04=Executing Commands=CA.odt> <ch04=CA.odt>

$ set +x
+xtrace 536: set +x


We have for loops like this throughout the book since they’re so handy. The 
trick here is plugging the right values into the arguments to mv, or cp, or 
whatever. In this case we’d already used the = as a delimiter, and all we cared 
about was the first field, so it was pretty easy. 


To figure out the values you need, use the ls (or find) command to list the
files you are working on and pipe them into whatever toolchain seems
appropriate, often cut, awk, or sed. bash parameter expansion (Recipe 5.18)
is also very handy here:


ls *.odt | cut -d'=' -f1 


Hopefully, a recipe somewhere in the book will give you the details you need
to come up with the right values for the arguments; then you can just plug all
the pieces in and go. Be sure to test using echo first and watch out for spaces
or other odd characters in filenames: they’ll get you every time.


TIP 


Don’t name your script rename. We are aware of at least two different rename
commands in major Linux flavors, and there are certainly many others. Red
Hat’s util-linux package includes a rename from_string to_string
file_name tool. Debian and derivatives include Larry Wall’s Perl-based
rename in their Perl packages, and have a related renameutils package. And
Solaris, HP-UX, and some BSDs document a rename system call, though that is
not easily accessible to end users. Try the rename manpage on your system and
see what you get.


See Also 


■ man mv

■ man rename

■ help for

■ Recipe 5.18, “Changing Pieces of a String”

■ Recipe 9.2, “Handling Filenames Containing Odd Characters”

■ Recipe 17.12, “Removing or Renaming Files Named with Special Characters”

■ Recipe 19.13, “Debugging Scripts”


17.2 Using GNU Texinfo and info on Linux 


Problem 


You are having trouble accessing documentation because much of the 
documentation for GNU tools on Linux is in Texinfo documents, the 
traditional manpages are just stubs, and the default info program is user- 
hostile (and you don’t feel like learning yet another single-use program). 


Solution 


Pipe the info command into a useful pager, such as less, but note you will
lose info’s link navigation features:


info bash | less 


Discussion 


info is basically a standalone version of the Emacs info reader, so if you are
an Emacs fan, maybe it will make sense to you. However, piping it into less
is a quick and simple way to view the documentation using a tool with which
you’re already familiar.


The idea behind Texinfo is good: generate various output formats from a
single source. It’s not new, since many other markup languages exist to do
the same thing; we even talk about one in Recipe 5.2. But if that’s the case,
why isn’t there a TeX to man output filter? Perhaps because manpages follow
a standard, structured, and time-tested format while Texinfo is more
free-form.


There are other Texinfo viewers and converters if you don’t like info, such as
pinfo, info2www, tkman, and even info2man (which cheats and converts to
POD and then to manpage format).


See Also 


■ man info

■ man man

■ http://en.wikipedia.org/wiki/Texinfo

■ Recipe 5.2, “Embedding Documentation in Shell Scripts”


17.3 Unzipping Many ZIP Files 


Problem 


You want to unzip many ZIP files in a directory, but unzip *.zip doesn’t 
work. 


Solution 


Put the pattern in single quotes, because unlike most other Unix commands, 
unzip handles file globbing patterns itself: 


unzip '*.zip'


You could also use a loop to unzip each file:


for x in /path/to/date*/name/*.zip; do unzip "$x"; done


or:


for x in $(ls /path/to/date*/name/*.zip 2>/dev/null); do unzip $x; done


Discussion 


Unlike many Unix commands (e.g., gzip and bzip2), the last argument to
unzip isn’t an arbitrarily long list of files. To process the command unzip
*.zip, the shell expands the wildcard, so (assuming you have files named
zipfile1.zip to zipfile4.zip) unzip *.zip expands to unzip zipfile1.zip
zipfile2.zip zipfile3.zip zipfile4.zip. This command attempts to
extract zipfile2.zip, zipfile3.zip, and zipfile4.zip from zipfile1.zip. The
command will fail unless zipfile1.zip actually contains files with those names.
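

The difference between the two forms is easy to see by putting echo in front of the command. This sketch works in a throwaway directory created with mktemp, so the file names and the variable names (scratch, expanded, quoted) are illustrative only:

```shell
# Create a scratch directory holding two empty files with .zip names
scratch=$(mktemp -d)
touch "$scratch/zipfile1.zip" "$scratch/zipfile2.zip"

# Unquoted: the shell expands the wildcard before unzip would ever see it
expanded=$(cd "$scratch" && echo unzip *.zip)

# Quoted: the pattern is passed through literally for unzip to handle itself
quoted=$(cd "$scratch" && echo unzip '*.zip')

echo "unquoted: $expanded"   # unquoted: unzip zipfile1.zip zipfile2.zip
echo "quoted:   $quoted"     # quoted:   unzip *.zip

rm -r "$scratch"
```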


The first method in the Solution section prevents the shell from expanding
the wildcard by using single quotes. However, that only works if there is only
one wildcard. The second and third methods work around that by running an
explicit unzip command for each ZIP file found when the shell expands the
wildcards, or returns the result of the ls command.


The ls version is used because the default behavior of bash (and sh) is to
return unmatched patterns unchanged. That means you would be trying to
unzip a file called /path/to/date*/name/*.zip if no files matched the wildcard
pattern. ls will simply return null on STDOUT, and an error that we throw
away on STDERR. You can set the shopt -s nullglob option to cause
filename patterns that match no files to expand to a null string, rather than
themselves.
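

As a sketch of the nullglob approach (bash only, since shopt is a bash builtin), the loop below runs against an empty scratch directory, so the pattern matches nothing and the body is never entered. The variable names are illustrative, and a real script would call unzip "$x" where the counter is incremented:

```shell
# Empty scratch directory: no ZIP files to find
scratch=$(mktemp -d)

shopt -s nullglob              # unmatched patterns now expand to nothing
matches=0
for x in "$scratch"/*.zip; do
    matches=$((matches + 1))   # a real script would run: unzip "$x"
done
shopt -u nullglob              # restore the default behavior

echo "ZIP files found: $matches"
rmdir "$scratch"
```

Turning nullglob off again afterward matters in an interactive shell or a longer script, because other commands may rely on unmatched patterns being passed through unchanged.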


See Also 


■ man unzip

■ http://www.info-zip.org

■ Recipe 15.13, “Working Around “Argument list too long” Errors”


17.4 Recovering Disconnected Sessions Using screen


Problem


You run long processes over SSH, perhaps over the WAN, and when you get 
disconnected you lose a lot of work. Or perhaps you started a long job from 
work, but need to go home and be able to check on the job later; you could 
run your process using nohup, but then you won’t be able to reattach to it 
when your connection comes back or you get home. 


Solution 
Install and use GNU screen.


Using screen is very simple. Type screen or screen -a. The -a option
includes all of screen’s capabilities, at the expense of some redraw (thus
bandwidth) efficiency. Honestly, we use -a but have never noticed a
difference.


When you do this, it will look like nothing happened, but you are now
running inside a screen. echo $SHLVL should return a number greater than
one if this worked (see also $SHLVL in Recipe 16.2). To test it, do an ls -la,
then kill your terminal (do not exit cleanly, as you will exit screen as well).
Log back into the machine and type screen -r to reconnect to screen. If that
doesn’t put you back where you left off, try screen -d -r. If that doesn’t
work, try ps auwx | grep [s]creen to see if screen is still running, and
then try man screen for troubleshooting information, but it should just
work. If you run into problems with that ps command on a system other than
Linux, see Recipe 17.21.


Starting screen with something like the following will make it easier to figure 
out what session to reattach to later if necessary: 


screen -aS "$(whoami).$(date '+%Y-%m-%d_%H:%M:%S%z')"


See the run_screen script in Recipe 16.22. 
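

A small sketch of how such a session name might be built and guarded. It relies on screen exporting the STY environment variable inside its sessions; the variable name session_name is an assumption for illustration, and the script only prints the command it would run rather than starting screen:

```shell
# Build a name like "jp.2024-01-31_14:05:09+0000" from the user and a timestamp
session_name="$(whoami).$(date '+%Y-%m-%d_%H:%M:%S%z')"

if [ -n "$STY" ]; then
    # screen sets STY in every window it creates, so we are already inside one
    echo "Already inside screen session: $STY"
else
    echo "Would run: screen -aS $session_name"
fi
```

Later, screen -ls lists sessions by these names, which makes it easier to choose the right one to pass to screen -r.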


To exit out of screen and your session, keep typing exit until all the sessions
are gone. You can also type Ctrl-A Ctrl-\ or Ctrl-A :quit to exit screen itself
(assuming you haven’t changed the default meta key of Ctrl-A yet).


Discussion 
According to the screen website: 


Screen is a full-screen window manager that multiplexes a physical 
terminal between several processes (typically interactive shells). Each 
virtual terminal provides the functions of the DEC VT100 terminal and, in 
addition, several control functions from the ANSI X3.64 (ISO 6429) and 
ISO 2022 standards (e.g., insert/delete line and support for multiple 
character sets). There is a scrollback history buffer for each virtual 
terminal and a copy-and-paste mechanism that allows the user to move 
text regions between windows. 


That means you can have more than one session in a single SSH terminal
(think DESQview on i286/386). But it also allows you to SSH into a machine,
start a process, disconnect your terminal and go home, then reconnect and
pick up, not where you left off, but where the process has continued to. And
it allows multiple people to share a single session for training,
troubleshooting, or collaboration (see Recipe 17.5).


Caveats 


screen is often installed by default on Linux, but rarely on other systems. The 
screen binary must run as SUID root so it can write to the appropriate 
/usr/dev pseudoterminals (PTYs). If screen doesn’t work, this is a likely 
reason why (to fix it, run the command chmod u+s /usr/bin/screen as 
root). 


Also, screen interferes with inline transfer protocols like zmodem. Newer 
versions of screen have configuration settings that deal with this; see the 
manpages. 

Configuration 


The default Emacs mode of bash command-line editing uses Ctrl-A to go to 
the start of the line. That’s also the screen command mode, or meta key, so if 
you use Ctrl-A a lot (like we do), you may want to add the following to your 
~/.screenrc file: 


# Sample settings for ~/.screenrc 
# Change the C-a default to C-n (use C-n n to send literal ^N) 
escape ^Nn 


# Yes annoying audible bell, please 
vbell off 


# Detach on hangup 
autodetach on 


# Make the shell in every window a login shell 
shell -$SHELL 


See Also 


■ screen manpage
■ http://www.gnu.org/software/screen
■ http://en.wikipedia.org/wiki/GNU_Screen
■ http://aperiodic.net/screen
■ Recipe 16.2, “Customizing Your Prompt”
■ Recipe 16.22, “Getting Started with a Custom Configuration”
■ Recipe 17.5, “Sharing a Single bash Session”
■ Recipe 17.6, “Logging an Entire Session or Batch Job”
■ Recipe 17.9, “Creating an Index of Many Files”
■ Recipe 17.20, “Grepping ps Output Without Also Getting the grep Process
  Itself”


17.5 Sharing a Single bash Session 

Problem 


You need to share a single bash session for training or troubleshooting 
purposes, and there are too many people for “over the shoulder” to work. Or 
you need to help someone who’s located somewhere else, and you need to 
share a session across a network. 


Solution 


Use GNU screen in multiuser mode. The following assumes that you have 
not changed the default meta key from Ctrl-A, as described in Recipe 17.4. If 
you have, then use your new meta key (e.g., Ctrl-N) instead. 


As the host, do the following: 

■ Enter screen -S session_name (no spaces allowed); e.g., screen -S
  training.
■ Type Ctrl-A :addacl usernames, listing the accounts (comma-delimited,
  no spaces!) that may access the display; e.g., Ctrl-A :addacl
  alice,bob,carol. Note this allows full read/write access.
■ Use the Ctrl-A :chacl usernames permbits list command to refine
  permissions if needed.
■ Turn on multiuser mode with Ctrl-A :multiuser on.


As the viewer, do this: 


■ Use screen -x user/name to connect to a shared screen; e.g., screen -x
  host/training.
■ Hit Ctrl-A K to kill the window and end the session.


Discussion 
See Recipe 17.4 for necessary details. 


For multiuser mode, /tmp/screens must exist and be world-readable and 
executable. 

screen versions 3.9.15-8 to 4.0.1-1 from Red Hat (i.e., RHEL3) are broken 
and should not be used if you want multiuser mode to work. Version 4.0.2-5 
or later should work; for example, http://bit.ly/2y9ufL4 (or later) works even 
on RHEL3. Once you start using the new version of screen, existing screen 
sockets in $HOME/.screen are not found and are thus orphaned and unusable. 
Log out of all sessions, and use the new version to create new sockets in 
/tmp/screens/S-$USER, then remove the $HOME/.screen directory. 


See Also 


■ man screen
■ http://www.gnu.org/software/screen
■ Recipe 9.11, “Finding a File Using a List of Possible Locations”
■ Recipe 16.22, “Getting Started with a Custom Configuration”
■ Recipe 17.4, “Recovering Disconnected Sessions Using screen”
■ Recipe 17.6, “Logging an Entire Session or Batch Job”


17.6 Logging an Entire Session or Batch Job 


Problem 


You need to capture all the output from an entire session or a long batch job. 


Solution 


There are many ways to solve this problem, depending on your needs and 
environment. 


The simplest solution is to turn on logging to memory or disk in your 
terminal program. The problems with that are that your terminal program 
may not allow it, and when it gets disconnected you lose your log. 


The next simplest solution is to modify the job to log itself, or redirect the 
entire thing to tee or a file. For example, one of the following might work: 




long_noisy_job >& log_file 
long_noisy_job 2>&1 | tee log_file 

( long_noisy_job ) >& log_file 
( long_noisy_job ) 2>&1 | tee log_file 


The problems here are that you may not be able to modify the job, or the job 
itself may do something that precludes these solutions (e.g., if it requires user 
input, it could get stuck asking for the input before the prompt is actually 
displayed). That can happen because STDOUT is buffered, so the prompt 
could be in the buffer waiting to be displayed when more data comes in, but 
no more data will come in since the program is waiting for input. 
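
Where the job can simply be wrapped, the tee form is often packaged as a small function. The following is only a sketch: run_logged is a hypothetical name, and it does nothing about the buffering problem just described. It shows one way to keep the job's own exit status, which a plain pipeline into tee would otherwise hide:

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run a command, show its output, and keep a copy.
# Usage: run_logged LOG_FILE COMMAND [ARGS...]
run_logged() {
    local log_file=$1; shift
    # 2>&1 sends STDERR down the same pipe as STDOUT so tee sees both
    "$@" 2>&1 | tee "$log_file"
    # Return the job's exit status rather than tee's (PIPESTATUS is bash-only)
    return "${PIPESTATUS[0]}"
}
```

For example, run_logged build.log make all would display the build output, save it to build.log, and still fail if make fails.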


The third solution is to use an interesting program called script that exists for 
this very purpose, and it’s probably already on your system. You run script, 
and it logs everything that happens to the logfile (called a typescript) you’ve 
given it, which is OK if you want to log the entire session: just start script, 
then run your job. But if you only want to capture part of the session, there is 
no way to have your code start script, run something to log it, then stop script 
again. You can’t script script because once you run it, you’re in a subshell at 
a prompt (i.e., you can’t do something like script file_to_log_to 
some_command_to_run). 


Our final solution uses the terminal multiplexer screen. With screen, you can 
turn whole session logging on or off from inside your script. Once you are 
already running screen, do the following in your script: 


# Set a logfile and turn on logging 
screen -X logfile /path/to/logfile && screen -X log on 

# Your commands here 

# Turn logging back off 
screen -X logfile flush 1 # Set buffer to 1 sec 
sleep 3                   # Wait to avoid file truncation... 
screen -X log off 
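
Since those commands only make sense inside a running screen session, a script may want to check first. The sketch below relies on the STY environment variable, which screen sets in every window it starts; the function names in_screen and start_screen_log are hypothetical, chosen for illustration:

```bash
#!/usr/bin/env bash
# screen exports STY (the session name) into every window it creates,
# so a non-empty STY is a reasonable sign that we are inside screen.
in_screen() {
    [ -n "$STY" ]
}

# Hypothetical wrapper: turn on screen logging to the given file, but
# only when actually running under screen.
start_screen_log() {
    local log_file=$1
    if ! in_screen; then
        echo "Not inside screen; logging not started" >&2
        return 1
    fi
    screen -X logfile "$log_file" && screen -X log on
}
```

A script could then call start_screen_log and fall back to one of the earlier solutions if it returns nonzero.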


Discussion 

We suggest you try the solutions in order, and use the first one that meets 
your needs. Unless you have very specific needs, script will probably work. 
But just in case, it can be handy to know about the screen option. 


See Also 


■ man script
■ man screen
■ Recipe 17.5, “Sharing a Single bash Session”


17.7 Clearing the Screen When You Log Out 


Problem 


You use or administer some systems that do not clear the screen when you 
log out, and you’d rather not leave the tail end of whatever you were working 
on visible, since that could be an information leak. 


Solution 


Put the clear command in your ~/.bash_logout (Example 17-1, reproduced 
from Recipe 16.22). 


Example 17-1. ch16/bash_logout 


# cookbook filename: bash_logout 
# settings/bash_logout: execute on shell logout 

# Clear the screen on logout to prevent information leaks, if not already 
# set as an exit trap elsewhere 
[ -n "$PS1" ] && clear 


Or set a trap to run clear on shell termination: 


# Trap to clear the screen on exit from the shell to prevent 
# information leaks, if not already set in ~/.bash_logout 
trap ' [ -n "$PS1" ] && clear ' 0 


Note that if you are connecting remotely and your client has a scrollback 
buffer, whatever you were working on may still be in there. clear also has no 
effect on your shell’s command history. 


Discussion 


Setting a trap to clear the screen is probably overkill, but could conceivably 
cover an error situation in which ~/.bash_logout is not executed. If you are 
really paranoid you can set both, but in that case you may also wish to look 
into TEMPEST and Faraday cages. 


If you skip the test to determine whether the shell is interactive, you’ll get 
errors like these under some circumstances: 


# e.g., from tput 
No value for TERM and no -T specified 


# e.g., from clear 
TERM environment variable not set. 


See Also 


■ http://en.wikipedia.org/wiki/TEMPEST
■ http://en.wikipedia.org/wiki/Faraday_cage
■ Recipe 16.22, “Getting Started with a Custom Configuration”


17.8 Capturing File Metadata for Recovery 


Problem 


You want to create a list of files and details about them for archive purposes; 
for example, to verify backups, recreate directories, etc. Or maybe you are 
about to do a large chmod -R and need a backout plan, or perhaps you keep 
/etc/* in a revision control system that does not preserve permissions or 
ownership. 


Solution 
Use GNU find with some printf formats, as seen in Example 17-2. 


Example 17-2. ch17/archive_meta-data 


#!/usr/bin/env bash 
# cookbook filename: archive_meta-data 


printf "%b" "Mode\tUser\tGroup\tBytes\tModified\tFileSpec\n" > archive_file 
find / \( -path /proc -o -path /mnt -o -path /tmp -o -path /var/tmp \ 
    -o -path /var/cache -o -path /var/spool \) -prune \ 
    -o -type d -printf 'd%m\t%u\t%g\t%s\t%t\t%p/\n' \ 
    -o -type l -printf 'l%m\t%u\t%g\t%s\t%t\t%p -> %l\n' \ 
    -o -printf '%m\t%u\t%g\t%s\t%t\t%p\n' >> archive_file 


WARNING 


Note that the -printf expression is in the GNU version of find. 


Discussion 


The (-path /proc -o -path...) -prune part removes various directories 
you probably don’t want to bother with. -type d is for directories. The printf 
format is prefixed with a d, then uses an octal mode, user, group, and so 
forth. -type l is for symbolic links and also shows you where each link 
points. With the contents of this file and some additional scripting, you can 
determine at a high level if anything has changed, or recreate mangled 
ownership or permissions. Note that this does not take the place of more 
security-oriented programs like Tripwire, AIDE, or Samhain. 
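
The “additional scripting” could start out as something like the following sketch. It assumes the tab-separated layout written by archive_meta-data above (Mode, User, Group, Bytes, Modified, FileSpec); the function name emit_restore_cmds is hypothetical. It only prints chown and chmod commands for review and runs nothing itself:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the commands that would restore
# mode and ownership from a listing made by archive_meta-data.
# Usage: emit_restore_cmds ARCHIVE_FILE
emit_restore_cmds() {
    local mode user group bytes modified filespec
    # Fields are tab-separated: Mode User Group Bytes Modified FileSpec
    while IFS=$'\t' read -r mode user group bytes modified filespec; do
        case "$mode" in
            Mode) continue ;;                 # skip the header line
            l*)   continue ;;                 # skip symbolic links
            d*)   mode=${mode#d}              # drop the 'd' prefix
                  filespec=${filespec%/} ;;   # and the trailing slash
        esac
        # %q quotes each value so odd filenames survive copy and paste
        printf 'chown %q:%q %q\n' "$user" "$group" "$filespec"
        printf 'chmod %q %q\n' "$mode" "$filespec"
    done < "$1"
}
```

Printing instead of executing is a deliberate choice here: the output can be inspected, filtered with grep, and then piped to a root shell once it looks right.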


See Also 

■ man find
■ Chapter 9
■ https://www.tripwire.com/
■ http://aide.sourceforge.net/
■ http://la-samhna.de/samhain/index.html


17.9 Creating an Index of Many Files 


Problem 


You have a number of files for which you’d like to create an index. 


Solution 


Use the find command in conjunction with head, grep, or other commands 
that can parse out comments or summary information from each file. 


For example, if the second line of all your shell scripts follows the format 
“name— description” then this example will create a nice index: 


for i in $(grep -El '#![[:space:]]?/bin/sh' *); do head -2 $i | tail -1; done 


Discussion 


As noted, this technique depends on each file having some kind of summary 
information, such as comments, that may be parsed out. We then look for a 
way to identify the type of file, in this case a shell script, and grab the second 
line of each file. 
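
The same idea can be sketched as a function that also prints each filename and copes with spaces in names. This is an illustration under the same assumption as above (the description sits on line two of each /bin/sh script); index_scripts is a hypothetical name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "filename: second line" for every /bin/sh
# script directly inside a directory (default: the current directory).
index_scripts() {
    local dir=${1:-.} file
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        # Only index files whose first line is a /bin/sh shebang
        if head -n 1 "$file" | grep -Eq '^#![[:space:]]?/bin/sh'; then
            printf '%s: %s\n' "${file##*/}" "$(sed -n '2p' "$file")"
        fi
    done
}
```

Checking only the first line of each file, rather than grepping the whole file, avoids indexing files that merely mention /bin/sh somewhere in their text.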


If the files do not have easily parsed summary information, you can try 
something like this and manually work through the output to create an index: 


for dir in $(find . -type d); do head -15 $dir/*; done 


WARNING 


Watch out for binary files! 


See Also 


■ man find
■ man grep
■ man head
■ man tail


17.10 Using diff and patch 


Problem 


You can never remember how to use diff to create patches that may later be 
applied using patch. 


Solution 


If you are creating a simple patch for a single file, use: 


$ diff -u original_file modified_file > your_patch 
$ 


If you are creating a patch for multiple files in parallel directory structures, 
use: 


$ cp -pR original_dirs/ modified_dirs/ 
$ 


# Make changes here 


$ diff -Nru original_dirs/ modified_dirs/ > your_comprehensive_patch 


To be especially careful, force diff to treat all files as ASCII using -a, and set 
your language and time zone to the universal defaults as shown: 


$ LC_ALL=C TZ=UTC diff -aNru original_dirs/ modified_dirs/ \ 
> > your_comprehensive_patch 


$ 


$ LC_ALL=C TZ=UTC diff -aNru original_dirs/ modified_dirs/ 
diff -aNru original_dirs/changed_file modified_dirs/changed_file 
--- original_dirs/changed_file 2006-11-23 01:04:07.000000000 +0000 
+++ modified_dirs/changed_file 2006-11-23 01:04:35.000000000 +0000 
@@ -1,2 +1,2 @@ 
 This file is common to both dirs. 
-But it changes from one to the other. 
+But it changes from 1 to the other. 
diff -aNru original_dirs/only_in_mods modified_dirs/only_in_mods 
--- original_dirs/only_in_mods 1970-01-01 00:00:00.000000000 +0000 
+++ modified_dirs/only_in_mods 2006-11-23 01:05:58.000000000 +0000 
@@ -0,0 +1,2 @@ 
+While this file is only in the modified dirs. 
+It also has two lines, this is the last. 
diff -aNru original_dirs/only_in_orig modified_dirs/only_in_orig 
--- original_dirs/only_in_orig 2006-11-23 01:05:18.000000000 +0000 
+++ modified_dirs/only_in_orig 1970-01-01 00:00:00.000000000 +0000 
@@ -1,2 +0,0 @@ 
-This file is only in the original dirs. 
-It has two lines, this is the last. 


To apply a patch file, cd to the directory of the single file or to the parent of 
the directory tree and use the patch command: 


$ cd /path/to/files 
$ patch -Np1 < your_patch 


The -N argument to patch prevents it from reversing patches or reapplying 
patches that have already been made. -p number removes number of leading 
directories to allow for differences in directory structure between whoever 
created the patch and whoever is applying it. Using -p1 will often work; if 
not, experiment with -p0, then -p2, etc. It’ll either work or complain and ask 
you what to do, in which case you cancel and try something else unless you 
really know what you are doing. 


TIP 


The patch command supports an option called --dry-run that will, in the 
words of the manpage, “print the results of applying the patches without 
actually changing any files”; this is worth doing before you run the 
command for real. 
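
One way to make the dry run a habit is to wrap both steps in a function, as in this sketch. It assumes a patch that supports --dry-run (as the manpage quoted above describes); apply_patch_checked is a hypothetical name, and the strip level defaults to the -p1 used earlier:

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: apply a patch only if a dry run succeeds first.
# Usage: apply_patch_checked PATCH_FILE [STRIP_LEVEL]
apply_patch_checked() {
    local patch_file=$1 strip=${2:-1}
    # First pass changes nothing; it just reports whether patch is happy
    if patch -N -p"$strip" --dry-run < "$patch_file" > /dev/null; then
        patch -N -p"$strip" < "$patch_file"
    else
        echo "Dry run failed; not applying $patch_file" >&2
        return 1
    fi
}
```

Run it from the directory you would normally cd to before running patch by hand. Because of -N, a patch that has already been applied fails the dry run and nothing is touched.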


Discussion 


diff can produce output in various forms, some of which are more useful than 
others. Unified output, using -u, is generally considered the best because it is 
both reasonably human-readable and very robust when used with patch. It 
provides three lines of context around the change, which allows a human 
reader to get oriented, and allows the patch command to work correctly even 
if the file to be patched is different from the one used to create the patch. As 
long as the context lines are intact, patch can usually figure it out. Context 
output, using -c, is similar to -u output but is more redundant and not quite 
as easy to read. The ed format, using -e, produces a script suitable for use 
with the ancient ed editor. Finally, the default output is similar to the ed 
output, with a little more human-readable context: 


# Unified format (preferred) 
$ diff -u original_file modified_file 
--- original_file 2006-11-22 19:29:07.000000000 -0500 
+++ modified_file 2006-11-22 19:29:47.000000000 -0500 
@@ -1,9 +1,9 @@ 
-This is original_file, and this line is different. 
+This is modified_file, and this line is different. 
 This line is the same. 
 So is this one. 
 And this one. 
 Ditto. 
-But this one is different. 
+But this 1 is different. 
 However, not this line. 
 And this is the last same, same, same. 


# Context format 
$ diff -c original_file modified_file 
*** original_file Wed Nov 22 19:29:07 2006 
--- modified_file Wed Nov 22 19:29:47 2006 
*************** 
*** 1,9 **** 
! This is original_file, and this line is different. 
  This line is the same. 
  So is this one. 
  And this one. 
  Ditto. 
! But this one is different. 
  However, not this line. 
  And this is the last same, same, same. 
--- 1,9 ---- 
! This is modified_file, and this line is different. 
  This line is the same. 
  So is this one. 
  And this one. 
  Ditto. 
! But this 1 is different. 
  However, not this line. 
  And this is the last same, same, same. 


# 'ed' format 
$ diff -e original_file modified_file 
6c 
But this 1 is different. 
. 
1c 
This is modified_file, and this line is different. 
. 


# Normal format 
$ diff original_file modified_file 
1c1 
< This is original_file, and this line is different. 
--- 
> This is modified_file, and this line is different. 
6c6 
< But this one is different. 
--- 
> But this 1 is different. 


The -r and -N arguments to diff are simple yet powerful. -r means, as usual, 
recursive operation through the directory structure, while -N causes diff to 
pretend that any file found in one directory structure also exists in the other 
as an empty file. In theory, that has the effect of creating or removing files as 
needed; however, in practice -N is not supported on all systems (notably 
Solaris) and it may end up leaving zero-byte files lying around on others. 
Some versions of patch default to using -b, which leaves lots of .orig files 
lying around, and some versions (notably Linux) are less chatty than others 
(notably BSD). Many versions (not Solaris) of diff also support the -p 
argument, which tries to show which C function the patch affects. 


Resist the urge to do something like diff -u prog.c.orig prog.c. This 
has the potential to cause all kinds of confusion since patch may also create 
.orig files. Also resist the urge to do something like diff -u prog/prog.c 
new/prog/prog.c, since patch will get very confused about the unequal 
number of directory names in the paths. 


WDIFF 


There is another little-known tool called wdiff that is also of interest here. wdiff 
compares files to detect changes in words, as defined by surrounding whitespace. It 
can handle differing line breaks and tries to use termcap strings to produce more 
readable output. It can be handy when comparing line-by-line is not granular enough, 
and it is similar to the word diff feature of Emacs and git diff --word-diff. Note 
that it is rarely installed on a system by default. You can get it from the Free Software 
Directory or via your system’s packaging manager. Here is an example of wdiff’s 
output: 

$ wdiff original_file modified_file 
This is [-original_file,-] {+modified_file,+} and this line is different. 
This line is the same. 
So is this one. 
And this one. 
Ditto. 
But this [-one-] {+1+} is different. 
However, not this line. 
And this is the last same, same, same. 
$ 


See Also 

■ man diff
■ man patch
■ man cmp
■ https://directory.fsf.org/wiki/Wdiff
■ http://furius.ca/xxdiff/ for a great GUI diff (and more) tool


17.11 Counting Differences in Files 


Problem 


You have two files and need to know about how many differences exist 
between them. 


Solution 

Count the hunks (i.e., sections of changed data) in diff’s output: 


$ diff -C0 original_file modified_file | grep -c "^\*\*\*\*\*" 
2 

$ diff -C0 original_file modified_file 
*** original_file Fri Nov 24 12:48:35 2006 
--- modified_file Fri Nov 24 12:48:43 2006 
*************** 
*** 1 **** 
! This is original_file, and this line is different. 
--- 1 ---- 
! This is modified_file, and this line is different. 
*************** 
*** 6 **** 
! But this one is different. 
--- 6 ---- 
! But this 1 is different. 


If you only need to know whether the files are different and not how many 
differences there are, use cmp. It will exit at the first difference, which can 
save time on large files. Like diff, it is silent if the files are identical, but it 
reports the location of the first difference if not: 


$ cmp original_file modified_file 
original_file modified_file differ: char 9, line 1 
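
The two checks combine naturally in a script: let cmp answer the cheap question first, and only run diff when the files really differ. The following is a sketch under the same assumptions as the commands above (a diff that accepts -C0); count_hunks is a hypothetical name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the number of changed hunks between two files.
# Usage: count_hunks OLD_FILE NEW_FILE
count_hunks() {
    local old=$1 new=$2
    # cmp -s is silent and stops at the first differing byte
    if cmp -s "$old" "$new"; then
        echo 0
    else
        # Each hunk in a context diff starts with a line of asterisks
        diff -C0 "$old" "$new" | grep -c '^\*\*\*\*\*'
    fi
}
```

For the sample files used in this recipe, count_hunks original_file modified_file should print the same count as the pipeline shown earlier.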


Discussion 


Hunk is actually the technical term, though we’ve also seen hunks referred to 
as chunks in some places. Note that it is possible, in theory, to get slightly 
different results for the same files across different machines or versions of 
diff, since the number of hunks is a result of the algorithm diff uses. You will 
certainly get different answers when using different diff output formats, as 
demonstrated in the following examples. 


We find a zero-context contextual diff to be the easiest to use for this purpose, 
and using -C0 instead of -c creates fewer lines for grep to have to search. A 
unified diff tends to combine more changes than expected into one hunk, 
leading to fewer differences being reported: 


$ diff -u original_file modified_file | grep -c "^@@" 
1 

$ diff -u original_file modified_file 
--- original_file 2006-11-24 12:48:35.000000000 -0500 
+++ modified_file 2006-11-24 12:48:43.000000000 -0500 
@@ -1,8 +1,8 @@ 
-This is original_file, and this line is different. 
+This is modified_file, and this line is different. 
 This line is the same. 
 So is this one. 
 And this one. 
 Ditto. 
-But this one is different. 
+But this 1 is different. 
 However, not this line. 
 And this is the last same, same, same. 


A normal or ed-style diff works too, but the grep pattern is more complicated. 
Though not shown in this example, a multiline change in normal diff output 
might look like 2,3c2,3, thus requiring character classes and more typing 
than is the case using -C0: 


$ diff -e original_file modified_file | egrep -c '^[[:digit:],]+[[:alpha:]]+' 
2 

$ diff original_file modified_file | egrep -c '^[[:digit:],]+[[:alpha:]]+' 
2 

$ diff original_file modified_file 
1c1 
< This is original_file, and this line is different. 
--- 
> This is modified_file, and this line is different. 
6c6 
< But this one is different. 
--- 
> But this 1 is different. 


See Also 


■ man diff
■ man cmp
■ man grep
■ http://en.wikipedia.org/wiki/Diff


17.12 Removing or Renaming Files Named with 
Special Characters 


Problem 


You need to remove or rename a file that was created with a special character 
that causes rm or mv to behave in unexpected ways. The canonical example 
of this is any file starting with a dash, such as -f or --help, which will cause 
any command you try to use to interpret the filename as an argument. 


Solution 


If the filename begins with a dash, use -- to signal the end of arguments to 
the command, or use a full (/tmp/-f) or relative (./-f) path. If the file 
contains other special characters that are interpreted by the shell, such as a 
space or asterisk, use shell quoting. If you use filename completion (the Tab 
key by default), it will automatically quote special characters for you. You 
can also use single quotes around the troublesome name: 


$ ls 
--help this is a *crazy* file name! 


$ mv --help help 
mv: unknown option -- - 
usage: mv [-fiv] source target 
       mv [-fiv] source ... directory 

$ mv -- --help my_help 

$ mv this\ is\ a\ \*crazy\*\ file\ name\! this_is_a_better_name 

$ ls 
my_help this_is_a_better_name 


Discussion 


To understand what is actually being executed after shell expansion, preface 
your command with echo: 


$ rm * 
rm: unknown option -- - 
usage: rm [-f|-i] [-dPRrvW] file ... 


$ echo rm * 
rm --help this is a *crazy* file name! 


You can also create a file named -i in a directory to prevent rm * from 
deleting all the files without asking first: 


$ mkdir del-test ; cd $_ 
$ > -i 
$ touch important_file 
$ ll 
total 0 
-rw-r--r-- 1 jp jp 0 Jun 12 22:28 -i 
-rw-r--r-- 1 jp jp 0 Jun 12 22:28 important_file 

$ rm * 
rm: remove regular empty file ‘important_file'? n 


See Also 


■ Question 11 in the GNU Core Utilities FAQ
■ Sections 2.1 and 2.2 of the Unix FAQs
■ Recipe 1.8, “Using Shell Quoting”


17.13 Prepending Data to a File 


Problem 


You want to prepend data to an existing file, for example to add a header 
after sorting. 


Solution 


Use cat in a subshell: 


temp_file="temp.$RANDOM$RANDOM$$" 
(echo 'static header line1'; cat data_file) > $temp_file \ 
    && cat $temp_file > data_file 
rm $temp_file 
unset temp_file 


You could also use sed, the streaming editor. To prepend static text, note that 
backslash escape sequences are expanded in GNU sed but not in some other 
versions. Also, under some shells the trailing backslashes may need to be 
doubled: 


# Any sed, e.g., Solaris 10 /usr/bin/sed 
$ sed -e '1i\ 
> static header line1 
> ' data_file 
static header line1 
1 foo 
2 bar 
3 baz 

$ sed -e '1i\ 
> static header line1\ 
> static header line2 
> ' data_file 
static header line1 
static header line2 
1 foo 
2 bar 
3 baz 


# GNU sed 
$ sed -e '1istatic header line1\nstatic header line2' data_file 
static header line1 
static header line2 
1 foo 
2 bar 
3 baz 


To prepend an existing file: 


$ sed -e '$r data_file' header_file 
Header Line1 
Header Line2 
1 foo 
2 bar 
3 baz 


Discussion 


This one seems to be a love/hate kind of thing. People either love the cat 
solution or love the sed solution, but not both. The cat version is probably 
faster and simpler; the sed solution is arguably more flexible. 
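
For repeated use, the cat approach can be sketched as a function. This is an illustration, not the book's code: prepend_text is a hypothetical name, and it uses mktemp rather than $RANDOM for the temporary file (see Recipe 14.11). It overwrites the original file's contents in place, so the inode and permissions are left alone:

```bash
#!/usr/bin/env bash
# Hypothetical helper: put one line of text at the top of an existing file.
# Usage: prepend_text 'header text' FILE
prepend_text() {
    local header=$1 file=$2 tmp
    tmp=$(mktemp) || return 1
    # Build header + original data in the temp file first, and only copy
    # it back if that worked and the result is not empty
    { printf '%s\n' "$header"; cat "$file"; } > "$tmp" \
        && [ -s "$tmp" ] \
        && cat "$tmp" > "$file"     # overwrite in place: keeps the inode
    local status=$?
    rm -f "$tmp"
    return $status
}
```

The && chain means a failure while building the temporary file never truncates the original data.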


You can also store a sed script in a file, instead of leaving it on the command 
line. Of course, you would usually redirect the output into a new file, like sed 
-e '$r data' header > new_file, but note that will change the file’s 
inode and may change other attributes, such as permissions or ownership. To 
preserve everything but the inode, use -i for in-place editing if your version 
of sed supports that. Don’t use -i with the reversed header file prepend form 
shown previously, though, or you will edit your header file! Also note that 
Perl has a similar -i option that also writes a new file, though Perl itself 
works rather differently than sed for this example: 


# Show inode 
$ ls -i data_file 
509951 data_file 

$ sed -i -e '1istatic header line1\nstatic header line2' data_file 

$ cat data_file 
static header line1 
static header line2 
1 foo 
2 bar 
3 baz 

# Verify inode has changed 
$ ls -i data_file 
509954 data_file 


To preserve everything (or if your sed does not have -i or you want to use 
the prepend file method mentioned earlier): 


# Show inode 
$ ls -i data_file 
509951 data_file 

# $RANDOM is bash-only; you can use mktemp on other systems 
$ temp_file=$RANDOM$RANDOM 

$ sed -e '$r data_file' header_file > $temp_file 

# Only cat if the source exists and is not empty! 
$ [ -s "$temp_file" ] && cat $temp_file > data_file 

$ unset temp_file 

$ cat data_file 
Header Line1 
Header Line2 
1 foo 
2 bar 
3 baz 

# Verify inode has NOT changed 
$ ls -i data_file 
509951 data_file 


Prepending a header file to a datafile is interesting because it’s rather 
counterintuitive. If you try to read the header_file file into the data_file 
file at line one, you get this: 


$ sed -e '1r header_file' data_file 
1 foo 
Header Line1 
Header Line2 
2 bar 
3 baz 


So instead, we simply append the data to the header file and write the output 
to another file. Again, don’t try to use sed -i or you will edit your header 
file. 


Another way to prepend data is to use cat reading from STDIN with a here- 
document or a here-string. Note that here-strings are only available in bash 
2.05b or newer, and they don’t do backslash escape sequence expansion, but 
they avoid all the sed version issues: 


# Using a here-document 
$ cat - data_file <<EoH 
> Header line1 
> Header line2 
> EoH 
Header line1 
Header line2 
1 foo 
2 bar 
3 baz 

# Using a here-string in bash-2.05b+, no backslash escape sequence expansion 
$ cat - data_file <<<'Header Line1' 
Header Line1 
1 foo 
2 bar 
3 baz 


See Also 


■ man cat
■ man sed
■ http://sed.sourceforge.net/sedfaq.html
■ http://sed.sourceforge.net/sed1line.txt
■ http://tldp.org/LDP/abs/html/x23170.html
■ Recipe 14.11, “Using Secure Temporary Files”
■ Recipe 17.14, “Editing a File in Place”


17.14 Editing a File in Place 


Problem 


You want to edit an existing file without affecting the inode or permissions. 


Solution 


This is trickier than it sounds because many tools you might ordinarily use, 
such as sed, will write to a new file (thus changing the inode) even if they go 
out of their way to preserve other attributes. 


The obvious solution is to simply edit the file and make your updates. 
However, we admit that that may be of limited use in a scripting situation. Or 
is it? 

In Recipe 17.13, you saw that sed writes a brand new file one way or another; 
however, there is an ancestor of sed that doesn’t do that. It’s called, 
anticlimactically, ed, and it is just as ubiquitous as its other famous 
descendant, vi. And interestingly, ed is scriptable. So here is our “prepend a 
header” example again, this time using ed: 


# Show inode 
$ ls -i data_file 
306189 data_file 

# Use printf "%b" to avoid issues with 'echo -e' or not 
$ printf "%b" '1i\nHeader Line1\nHeader Line2\n.\nw\nq\n' | ed -s data_file 
1 foo 

$ cat data_file 
Header Line1 
Header Line2 
1 foo 
2 bar 
3 baz 

# Verify inode has NOT changed 
$ ls -i data_file 
306189 data_file 


Discussion 


Of course you can store an ed script in a file, just as you can with sed. In this 
case, it might be useful to see what that file looks like, to explain the 
mechanics of the ed script: 


$ cat ed_script 
1i 
Header Line1 
Header Line2 
. 
w 
q 

$ ed -s data_file < ed_script 
1 foo 

$ cat data_file 
Header Line1 
Header Line2 
1 foo 
2 bar 
3 baz 


The 1i in the ed script means to go to the first line and then go into insert 
mode, and the next two lines are literal. A single . all by itself on a line exits 
insert mode, w writes the file, and q quits. The -s suppresses diagnostic 
output, specifically for use in scripts. 
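
The ed script described above can also be generated on the fly. The sketch below is one way to do that; ed_prepend is a hypothetical name. It assumes the target file already has at least one line (so that address 1 exists) and that no header line consists of a single dot, since that would end insert mode early:

```bash
#!/usr/bin/env bash
# Hypothetical helper: insert one or more header lines at the top of a
# file using ed, so the file is edited in place.
# Usage: ed_prepend FILE 'header line' ['header line' ...]
ed_prepend() {
    local file=$1; shift
    # Build the ed script: "1i", the header lines, a lone "." to leave
    # insert mode, then w (write) and q (quit)
    {
        printf '1i\n'
        printf '%s\n' "$@"
        printf '.\nw\nq\n'
    } | ed -s "$file"
}
```

For example, ed_prepend data_file 'Header Line1' 'Header Line2' feeds ed the same script as the ed_script file shown above.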


One disadvantage to ed is that there isn’t that much documentation for it 
anymore. It’s been around since the beginning of Unix, but it’s not 
commonly used now even though it exists on every system we checked. 
Since both vi (via ex) and sed (spiritually at least!) are descended from ed, 
however, you should be able to figure out anything you might want to do. 
Note that ex is a symbolic link to vi or a variant on many systems, while ed is 
just ed. 


Another way to accomplish the same effect is to use sed or some other tool, 
write the changed file into a new file, then cat it back into the original file. 
This is obviously inefficient. It is also easier to say than to do safely because 
if the change fails for any reason you could end up writing nothing back over 
the original file (see the example in Recipe 17.13). 


See Also 


■ man ed
■ man ex
■ ls -l $(which ex)
■ http://sed.sourceforge.net/sedfaq.html
■ Recipe 17.13, “Prepending Data to a File”


17.15 Using sudo on a Group of Commands 


Problem 


You are running as a regular user and need to sudo several commands at 
once, or you need to use redirection that applies to the commands and not to 
sudo. 


Solution 


Use sudo to run a subshell in which you may group your commands and use 
pipelines and redirection: 


sudo bash -c 'command1 && command2 || command3' 


This requires the ability to run a shell as root. If you can’t, have your system 
administrator write a quick script and add it to your sudo privilege 
specification. 


Discussion 


If you try something like sudo command1 && command2 || command3 you’ll
find that command2 and command3 are running as you, not as root. That’s
because sudo’s influence only extends to the first command and your shell is
doing the rest. That is, the sudo command only extends as far as the
ampersands; the shell sees them as the separator between commands.
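The same splitting can be observed without root privileges by letting env stand in for sudo. This is only an illustration of how the shell parses the line; the variable name MARKER is made up:

```shell
#!/usr/bin/env bash
unset MARKER

# env's influence ends at &&; the echo runs in the calling shell,
# where MARKER was never set.
outside=$(env MARKER=set true && echo "${MARKER:-unset}")

# With bash -c, the whole list runs inside the child shell started by env.
inside=$(env MARKER=set bash -c 'true && echo "${MARKER:-unset}"')

echo "without bash -c: $outside"
echo "with bash -c:    $inside"
```

Replacing env MARKER=set with sudo gives the situation described above: only what is inside the quoted bash -c string runs with the elevated identity.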


Note the use of the -c argument to bash, which causes it to just execute the 
given commands and exit. Without that you will just end up running a new 
interactive root shell, which is probably not what you wanted. With -c you 
are still running a noninteractive root shell, and you need to have the sudo 
rights to do that. macOS and some Linux distributions, such as Ubuntu, 
actually disable the root user to encourage you to only log in as a normal user 
and sudo as needed (the Mac hides this better) for administration. If you are 
using an OS like that, or have rolled your own sudo setup, you should be fine. 
However, if you are running a locked-down environment, this recipe may not 
work for you. 


To learn whether you may use sudo and what you are and are not allowed to
do, use sudo -l. Almost any other use of sudo will probably trigger a
security message to your administrator tattling on you. You can try using
sudo sudo -V | less as a regular user or just sudo -V | less if you are
already root to get a lot of information about how sudo is compiled and
configured on your system.


SU AND SUDO 


It’s always been a best practice to run as a regular user and only use root privileges
when absolutely necessary. While the su command is handy, many argue that sudo is
better, for reasons such as the following:


■ It takes more work to get sudo working properly (in other words, locked down
rather than just "ALL=(ALL) ALL") and it can be slightly less convenient to use,
but it can also foster more secure work practices.

■ You can forget that you have su’d to root and do something unfortunate.

■ Having to type sudo all the time makes you think about what you are doing a
little more.

■ sudo allows delegation of individual commands to other users without sharing
root’s password.

■ sudo can do everything su can, while the reverse is not true.


Both commands can incorporate logging, and there are some tricks that can make each 
command work very much like the other; however, there are still some significant 
differences. One of the most important is that with sudo you enter your own password
to confirm your identity before being allowed to execute a command. Thus, root’s
password is not shared if more than one person needs some root privileges. This
brings us to the second difference: sudo can be very specific about what commands a 
given user can and cannot execute. That restriction can be tricky, since many 
applications allow you to shell out and do something else (so, if you are able to sudo 
into vi, you can shell out and have an unrestricted root prompt). Still, used carefully 
sudo is an excellent tool. 


See Also 


■ man su
■ man sudo
■ man sudoers
■ man visudo
■ sudo
■ https://help.ubuntu.com/community/RootSudo
■ Recipe 14.15, “Writing setuid or setgid Scripts”
■ Recipe 14.18, “Running as a Non-root User”
■ Recipe 14.19, “Using sudo More Securely”
■ Recipe 14.20, “Using Passwords in Scripts”


17.16 Finding Lines That Appear in One File but Not in Another


Problem 


You have two datafiles and you need to compare them and find lines that 
exist in one file but not in the other. 


Solution 


Sort the files and isolate the data of interest using cut or awk if necessary, and
then use comm, diff, grep, or uniq depending on your needs.


comm is designed for just this type of problem: 


$ cat left
record_01
record_02.left only
record_03
record_05.differ
record_06
record_07
record_08
record_09
record_10

$ cat right
record_01
record_02
record_04
record_05
record_06.differ
record_07
record_08
record_09.right only
record_10

# Only show lines in the left file
$ comm -23 left right
record_02.left only
record_03
record_05.differ
record_06
record_09

# Only show lines in the right file
$ comm -13 left right
record_02
record_04
record_05
record_06.differ
record_09.right only

# Only show lines common to both files
$ comm -12 left right
record_01
record_07
record_08
record_10


diff will quickly show you all the differences from both files, but its output is
not terribly pretty and you may not need to know all the differences. GNU
diff’s -y and -W options can be handy for readability, but you can get used to
the regular output as well:


$ diff -y -W 60 left right
record_01                       record_01
record_02.left only           | record_02
record_03                     | record_04
record_05.differ              | record_05
record_06                     | record_06.differ
record_07                       record_07
record_08                       record_08
record_09                     | record_09.right only
record_10                       record_10

$ diff -y -W 60 --suppress-common-lines left right
record_02.left only           | record_02
record_03                     | record_04
record_05.differ              | record_05
record_06                     | record_06.differ
record_09                     | record_09.right only

$ diff left right
2,5c2,5
< record_02.left only
< record_03
< record_05.differ
< record_06
---
> record_02
> record_04
> record_05
> record_06.differ
8c8
< record_09
---
> record_09.right only


Some systems (e.g., Solaris) may use sdiff instead of diff -y or have a 
separate binary such as bdiff to process very large files. 


grep can show you when lines exist only in one file and not the other, and
you can figure out which file if necessary. But since it’s doing regular
expression matches, it will not be able to handle differences within the line
unless you edit the file that becomes the pattern file, and it will also get very
slow as the file sizes grow.


This example shows all the lines that exist in the file left but not in the file
right:


$ grep -vf right left
record_03
record_06
record_09


Note that only “record_03” is really missing; the other two lines are simply 
different. If you need to detect such variations, you’ll need to use diff. If you 
need to ignore them, use cut or awk as necessary to isolate the parts you need 
into temporary files. 
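If you would rather have grep report every line of left that has no exact twin in right, the standard -F (fixed strings) and -x (whole-line match) options turn off the regular expression behavior described above. A small sketch on made-up scratch files:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf '%s\n' record_01 'record_02.left only' record_03 record_05.differ \
    > "$workdir/left"
printf '%s\n' record_01 record_02 record_05 > "$workdir/right"

# -F: patterns are fixed strings, -x: a pattern must match the whole line,
# -v: invert the match, -f: read the patterns from a file.
missing=$(grep -Fxvf "$workdir/right" "$workdir/left")
printf '%s\n' "$missing"

rm -rf "$workdir"
```

Unlike comm, this does not require sorted input, but it still slows down as the pattern file grows.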


uniq -u can show you only lines that are unique in the files, but it will not
tell you which file the line came from (if you need to know that, use one of
the previous solutions). uniq -d will show you only lines that exist in both
files:


$ sort right left | uniq -u
record_02
record_02.left only
record_03
record_04
record_05
record_05.differ
record_06
record_06.differ
record_09
record_09.right only

$ sort right left | uniq -d
record_01
record_07
record_08
record_10


Discussion 


comm is your best choice if it’s available and you don’t need the power of 
diff. 
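comm expects both inputs to be sorted. When the originals are not, one option in bash is process substitution, which sorts on the fly. The file contents below are made up for the demonstration:

```shell
#!/usr/bin/env bash
# Scratch copies of two unsorted sample files.
workdir=$(mktemp -d)
printf '%s\n' record_03 record_01 record_02 > "$workdir/left"
printf '%s\n' record_02 record_04 record_01 > "$workdir/right"

# Lines only in left: both inputs are sorted on the fly.
only_left=$(comm -23 <(sort "$workdir/left") <(sort "$workdir/right"))
# Lines only in right
only_right=$(comm -13 <(sort "$workdir/left") <(sort "$workdir/right"))

echo "only in left:  $only_left"
echo "only in right: $only_right"
rm -rf "$workdir"
```

Both sort and comm should run under the same locale so that they agree on the collation order.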

You may need to sort and/or cut or awk into temporary files and work from
those if you can’t disrupt the original files.


See Also


■ man cmp
■ man diff
■ man grep
■ man uniq


17.17 Keeping the Most Recent N Objects 


Problem 


You need to keep the most recent N logfiles or backup directories, and purge 
the remainder, no matter how many there are. 


Solution 


Create an ordered list of the objects, pass them as arguments to a function,
shift the arguments by N, and return the remainder, as shown in Example 17-3.


Example 17-3. ch17/func_shift_by


# cookbook filename: func_shift_by

# Pop a given number of items from the top of a stack,
# such that you can then perform an action on whatever is left.
# Called like: shift_by <# to keep> <ls command, or whatever>
# Returns: the remainder of the stack or list
#
# For example, list some objects, then keep only the top 10.
#
# It is CRITICAL that you pass the items in order with the objects to
# be removed at the top (or front) of the list, since all this function
# does is remove (pop) the number of entries you specify from the top
# of the list.
#
# You should experiment with echo before using rm!
#
# For example:
#       rm -rf $(shift_by $MAX_BUILD_DIRS_TO_KEEP $(ls -rd backup.2006*))
#
function shift_by {

    # If $1 is zero or greater than $#, the positional parameters are
    # not changed.  In this case that is a BAD THING!
    if (( $1 == 0 || $1 > ( $# - 1 ) )); then
        echo ''
    else
        # Remove the given number of objects (plus 1) from the list.
        shift $(( $1 + 1 ))
        # Return whatever is left.
        echo "$*"
    fi
}

WARNING 


If you try to shift the positional parameters by zero or by more than the total
number of positional parameters ($#), shift will do nothing. If you are using
shift to process a list then delete what it returns, that will result in you deleting
everything. Make sure to test the argument to shift to make sure that it’s not
zero and that it is not greater than the number of positional parameters. Our
shift_by function does this.
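The behavior the warning describes is easy to see in isolation. The helper name try_shift below is made up, and the sketch assumes bash (other shells may treat a failed shift as a fatal error):

```shell
#!/usr/bin/env bash
# Show what is left of the arguments after attempting "shift <count>".
try_shift() {
    local count=$1
    shift                       # drop the count itself
    shift "$count" 2>/dev/null  # fails and changes nothing if count > $#
    echo "status=$? remaining=$*"
}

try_shift 2 a b c    # normal case
try_shift 0 a b c    # shift by zero: nothing removed
try_shift 5 a b c    # shift by more than $#: nothing removed
```

In the last two calls the full list a b c survives, which is exactly the list you would then hand to rm if you did not check first.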


For example: 


$ source shift_by
$ touch {1..9}

$ ls ?
1  2  3  4  5  6  7  8  9

$ shift_by 3 $(ls ?)
4 5 6 7 8 9

$ shift_by 5 $(ls ?)
6 7 8 9

$ shift_by 5 $(ls -r ?)
4 3 2 1

$ shift_by 7 $(ls ?)
8 9

$ shift_by 9 $(ls ?)

# Keep only the last 5 objects
$ echo "rm -rf $(shift_by 5 $(ls ?))"
rm -rf 6 7 8 9

# In production we'd test this first!  See discussion.
$ rm -rf $(shift_by 5 $(ls ?))

$ ls ?
1  2  3  4  5


Discussion 


Make sure you fully test both the argument returned and what you intend to
do with it. For example, if you are deleting old data, use echo to test the
command that would be performed before doing it live. Also test that you
have a value at all, or else you could end up doing rm -rf and getting an
error. Never do something like rm -rf /$variable, because if $variable
is ever null you will start deleting the root directory, which is particularly bad
if you are running as root!


Using the function in the solution to delete files in production might look like 
this: 


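$ files_to_nuke=$(shift_by 5 $(ls ?))
# Quote the test so an empty value is detected; leave the rm argument
# unquoted so the list splits into separate names.
$ [ -n "$files_to_nuke" ] && rm -rf $files_to_nuke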


This recipe takes advantage of the fact that arguments to a function are
affected by the shift command inside that function, which makes it trivial to
pop objects off the stack (otherwise we’d have to do some fancy substring or
for loop operations). We must shift by n+1 because the first argument ($1) is
actually the count of the items to shift, leaving $2..N as the objects in the
stack. We could also write it more verbosely this way:


function shift_by {
    shift_count=$1
    shift
    shift $shift_count
    echo "$*"
}


It’s possible you may run afoul of your system’s ARG_MAX (see Recipe 15.13 
for details) if the paths to the objects are very long or you have a very large 
number of objects to handle. In the former case, you may be able to create 
some breathing room by changing directories closer to the objects to shorten 
the paths, or by using symbolic links. In the latter case, you can use this more 
complicated for loop: 


objects_to_keep=5
counter=1

for file in /path/with/many/many/files/*e*; do
    if [ $counter -gt $objects_to_keep ]; then
        remainder="$remainder $file"
    fi
    (( counter++ ))
done

[ -n "$remainder" ] && echo "rm -rf $remainder"


A common method of doing a similar operation is a trickle-down scheme 
such as the following: 


rm -rf backup.3/
mv backup.2/ backup.3/
mv backup.1/ backup.2/
cp -al backup.0/ backup.1/


This works very well in many cases, especially when combined with hard
links to conserve space while allowing multiple backups (see Hack #42 in
Rob Flickenger’s Linux Server Hacks, O’Reilly). However, if the number of
existing objects fluctuates or is not known in advance, this method won’t
work.


See Also 

■ help for
■ help shift
■ Linux Server Hacks by Rob Flickenger (O’Reilly), Hack #42
■ Recipe 13.5, “Parsing Output with a Function Call”
■ Recipe 15.13, “Working Around “Argument list too long” Errors”
■ Recipe 17.18, “Writing to a Circular Log”


17.18 Writing to a Circular Log 


Problem 


You need to write datafiles and/or logs but you don’t want to spend too much 
effort purging them when they are obsolete. 


Solution 


Write the data into a circular set of files or directories, such as days of the week
or month, or months. You also need to have a way to clear the old data when
you circle around again.


Discussion 


This will only work if you have some well-defined series that can be circular,
such as hours of the day, days of the week, days of the month, or months. But
it turns out that those cover a lot of ground.


It helps to start with an example, so circular days of the week logfiles might 
look like this: 


1_Mon.log
2_Tue.log
3_Wed.log
4_Thu.log
5_Fri.log
6_Sat.log
7_Sun.log


We use the slightly odd strftime format %u_%a to make the files sort in a
human-readable way (yes, sort can handle days of the week, but ls can’t).
Then all of Monday’s log messages go into 1_Mon.log, and so on, and on
Sunday at midnight we wrap around to Monday again.


Typical formats include: 


$ printf "%(%u_%a)T"   # day of week
2_Tue
$ printf "%(%d)T"      # day of month
06
$ printf "%(%m_%b)T"   # month
12_Dec


The only tricky part is clearing out the data from last Monday before you
start writing data for this Monday. If you have a log statement that is always
the first to run on a new day, then have that statement truncate the output file
using > instead of the >> you need to use to append everywhere else. But
watch out for race conditions: it really has to be guaranteed to be the very
first log line of the correct day. Perhaps a safer way is to use a cron job to
delete tomorrow’s data a few minutes before midnight. There’s no race
condition there, since you know the last time you wrote to that file was a
week (or whatever period) ago, but there is a risk that if the cron job fails to
run correctly that data will not be purged.


Another way to do it is to have every call to the logging function delete the
data for tomorrow. This is robust but inefficient, since most of the time there
will be nothing to delete. It also reduces the window to N-1, since
“tomorrow” is always deleted.


For example: 


function mylog {
    local today tomorrow

    # Log for today
    printf -v today "%(%u_%a)T"
    echo "$*" >> $HOME/weekly_logs/$today.txt   # e.g., 1_Mon

    # Purge data from tomorrow
    tomorrow=$(date -d 'tomorrow' '+%u_%a')
    rm -f $HOME/weekly_logs/$tomorrow.txt
}


Note how we use both the bash builtin printf %(strftime format)T and
the GNU date command with the very useful -d or --date argument of
tomorrow. Using printf is more efficient since bash already knows what time
it is and there is no need for a subshell and external program, but that can’t
tell you what tomorrow will be.
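Where GNU date is not available, one possible workaround is to do the arithmetic yourself: the %(...)T format also accepts an epoch-seconds argument (bash 4.2 or later), with -1 meaning the current time. The sketch below is not part of the recipe; the function name tomorrow_name is made up, it assumes your strftime supports %s, and it ignores the rare daylight-saving day on which "now plus 24 hours" is not tomorrow:

```shell
#!/usr/bin/env bash
# Sketch: compute tomorrow's %u_%a name with builtins only (bash 4.2+).
tomorrow_name() {
    local now name
    printf -v now '%(%s)T' -1                        # current epoch seconds
    printf -v name '%(%u_%a)T' "$(( now + 86400 ))"  # same clock time, +24h
    echo "$name"
}

tomorrow_name
```

The arithmetic expansion does not start a subshell, so this keeps the efficiency argument made above for printf.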


Here are some example cron entries for a script that just keeps an eye on 
something: 


# Keep an eye on whatever it is every hour... ①
06 * * * * /home/user/report/keep-an-eye-on-it.sh

# Keep weekly reports ②
02 00 * * Mon ln -fs "queue-report_$(date '+\%F').txt" /home/user/report/keep-an-eye-on-it.txt

# Start the day fresh (which means rolling 6-7 days...) ③
03 00 * * * rm -f /home/user/report/$(date '+\%u_\%a')/*


① Run the script every hour.

② Create a symlink like keep-an-eye-on-it.txt -> keep-an-eye-on-it_2017-10-09.txt
so when the script writes to keep-an-eye-on-it.txt output actually goes to a
weekly keep-an-eye-on-it_2017-10-09.txt report you can archive. %F is a
shortcut in some versions of date for %Y-%m-%d.

③ Remove the contents of “tomorrow’s” directory, just before midnight.
Note that in some versions of cron (e.g., Vixie-cron) you must escape %
signs or you will get an error like “Syntax error: EOF in backquote
substitution.”

See Also 

■ help printf
■ man date
■ Chapter 11
■ Recipe 11.10, “Logging with Dates”
■ Recipe 17.19, “Circular Backups”
■ Recipe 19.10, “Deleting Files Using an Empty Variable”


17.19 Circular Backups 


Problem 


You need to back up some data but you don’t want to spend too much effort 
purging the backups when they are obsolete. 


Solution 


Write the backups into a circular set of files or directories, such as days of the
week or month, or months. You also need to have a way to clear the old data
when you circle around again.


Discussion


We’ve found that every once in a while Firefox will lose its session restore
feature, so we have a simple script to back up and restore that (Example 17-4).


Example 17-4. ch17/ff-sessions


#!/usr/bin/env bash
# cookbook filename: ff-sessions
# Save/Restore FF sessions

# Run from cron like:
# 45 03,15 * * * opt/bin/ff-sess.sh qsave   # ①

FF_DIR="$HOME/.mozilla/firefox"
date=$(date '+%u_%a_%H')   # e.g.: 3_Wed_15   # ②

case "$1" in
    qsave )   # Quiet save
        cd $FF_DIR
        rm -f ff_sessions_$date.zip                   # ③
        zip -9qr ff_sessions_$date.zip */session*     # ④
    ;;
    save )    # Noisy save (calls qsave)
        echo "SAVING '$FF_DIR/*/session*' data into '$date' file"   # ⑤
        $0 qsave                                      # ⑥
    ;;
    restore )
        [ -z "$2" ] && { echo "Need a date to restore from!"; exit 1; }   # ⑦
        date="$2"                                     # ⑧
        echo "Restoring session data from '$date' file"
        cd $FF_DIR
        unzip -o ff_sessions_$date.zip                # ⑨
    ;;
    * )                                               # ⑩
        echo 'Save/Restore FF sessions'
        echo "$0 save"
        echo "$0 restore <date>"
        echo "   e.g., $0 restore 3_Wed_15"
    ;;
esac


① Run from cron with a line like in the comment, in this case twice a day at
3:45 a.m. and 3:45 p.m.

② As in Recipe 17.18, we prefix the human-readable day of the week with a
number to make it sort correctly, then we add the hour at which the job
ran.

③ zip will normally append to a ZIP file, so we remove any existing file just
in case you have added or removed a profile. The -f (force) option will
prevent rm from generating an error if the file does not exist.

④ We use -9 for maximum compression, -q for quiet, and -r for recursive
zip operation, then we back up anything in the Firefox profile directories
that starts with session.

⑤ The “save” argument will display a message about what it’s doing.

⑥ Then it will call the “quiet” save. Normally for cron jobs you only want
output if something went wrong; otherwise you get an email every time
the job runs.

⑦ We’ve compressed what might otherwise be several lines into one line
here because, while the sanity check is important, we don’t want to
distract from the main point of the block.

⑧ We assign $2 to $date for later code clarity. This may seem silly in so
small a block, but it’s generally a good practice to follow and it’s better to
be consistent and not waste time thinking, “Should I?”

⑨ We use -o for unzip to overwrite the existing files, if any, so we’re not
prompted about that.

⑩ Finally, if we provide no options or the wrong ones, we get a helpful
reminder about usage.


This script can easily be extended to save weekly, monthly, and yearly
backups by either adding more options or changing the script to take an
argument instead of hardcoding “now” as we did, then adding more cron jobs
with the appropriate arguments. Note that in some versions of cron (e.g.,
Vixie-cron) you must escape % signs or you will get an error like “Syntax
error: EOF in backquote substitution.”


See Also


■ man zip
■ man unzip
■ http://bit.ly/ZrLpRn
■ http://kb.mozillazine.org/Session_Restore (“Troubleshooting”)
■ https://wiki.mozilla.org/Session_Restore
■ Recipe 17.18, “Writing to a Circular Log”


17.20 Grepping ps Output Without Also Getting the grep Process Itself


Problem 


You want to grep output from the ps command without also getting the grep 
process itself. 


Solution 


Change the pattern you are looking for so that it is a valid regular expression 
that will not match the literal text that ps will display: 


$ ps aux | grep 'ssh'
root   366  0.0  1.2   340  1588  ??  Is   20Oct06   0:00.68 /usr/sbin/sshd
root 25358  0.0  1.9   472  2404  ??  Ss   Wed07PM   0:02.16 sshd: root@ttyp0
jp   27579  0.0  0.4   152   540  p0  S+    3:24PM   0:00.04 grep ssh

$ ps aux | grep '[s]sh'
root   366  0.0  1.2   340  1588  ??  Is   20Oct06   0:00.68 /usr/sbin/sshd
root 25358  0.0  1.9   472  2404  ??  Ss   Wed07PM   0:02.17 sshd: root@ttyp0
$


Discussion


This works because [s] is a regular expression character class containing a 
single lowercase letter s, meaning that [s]sh will match ssh but not the 
literal string grep [s]sh that ps will display. 
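You can convince yourself of this without running ps at all. The two lines below are made-up stand-ins for ps output, one for the daemon and one for the grep process itself:

```shell
#!/usr/bin/env bash
# Simulated ps output: one real sshd line and the grep process itself.
sample='root   366  /usr/sbin/sshd
jp   27579  grep [s]sh'

# Only the sshd line contains the three characters "ssh" in a row.
printf '%s\n' "$sample" | grep '[s]sh'
```

The second line contains the text [s]sh, in which a bracket separates the two s characters, so the pattern has nothing to match there.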


The other (less efficient and more clunky) solution you might see is
something like this:


ps aux | grep 'ssh' | grep -v grep 


See Also 


■ man ps
■ man pgrep
■ man grep


17.21 Finding Out Whether a Process Is Running 


Problem 


You need to determine whether a process is running, and you might or might 
not already have a process ID (PID). 


Solution 


If you don’t already have a PID, grep the output of the ps command to see if
the program you are looking for is running (see Recipe 17.20 for details on
why our pattern is [s]sh):


ps -ef | grep -q 'bin/[s]shd' && echo 'ssh is running' \
    || echo 'ssh not running'


That’s nice, but you know it’s not going to be that easy, right? Right. It’s
difficult because ps can be wildly different from system to system.


Example 17-5 is a script you can use to find out if a process is running if you
don’t have a PID.


Example 17-5. ch17/is_process_running


# cookbook filename: is_process_running

# Can you believe this?!?
case $(uname) in
    Linux|AIX) PS_ARGS='-ewwo pid,args'   ;;
    SunOS)     PS_ARGS='-eo pid,args'     ;;
    *BSD)      PS_ARGS='axwwo pid,args'   ;;
    Darwin)    PS_ARGS='Awwo pid,command' ;;
esac

if ps $PS_ARGS | grep -q 'bin/[s]shd'; then
    echo 'sshd is running'
else
    echo 'sshd not running'
fi


If you do have a PID, say from a lockfile or an environment variable, just
search for it (be careful to match the PID up with some other recognizable
string so that you don’t have a collision where some other random process
just happens to have a stale PID that matches the one you are using). Use the
PID in the grep or in a -p argument to ps:


# Linux 
$ ps -wwo pid,args -p 1394 | grep 'bin/sshd' 
1394 /usr/sbin/sshd 


# BSD 
$ ps w -p 366 | grep 'bin/sshd' 
366 ?? Is 0:00.76 /usr/sbin/sshd 
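When all you need to know is whether a given PID still exists, another common check (not used in the recipe itself) is kill -0, which sends no signal and only reports whether the process could be signaled. The helper name pid_is_running is made up for this sketch:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a process with the given PID exists
# and we are allowed to signal it. Signal 0 performs the checks only.
pid_is_running() {
    kill -0 "$1" 2>/dev/null
}

sleep 30 &                 # stand-in for a long-running daemon
pid=$!
pid_is_running "$pid" && echo "PID $pid is running"

kill "$pid"                # stop it and reap it
wait "$pid" 2>/dev/null
pid_is_running "$pid" || echo "PID $pid is gone"
```

This says nothing about which program owns the PID, and it fails for processes owned by other users unless you are root, so the ps and grep check above is still needed to guard against a stale or reused PID.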


If your system has pgrep installed, you can use that too. It has many options,
but we’re only using -f to search the full command line instead of just the
process name, and -a to display the full command line:


$ pgrep -fa 'bin/[s]shd' ; echo $?
1278 /usr/sbin/sshd -D
0


Discussion 


The grep -q portion of the first solution requires a little explanation.
With -q, grep prints nothing and reports the result only through its exit
status: zero if any line matched, nonzero if none did. The && and || then act
on that status. You just have to make sure your ps and greps do exactly what
you want.


Unfortunately, the ps command is one of the most fragmented in all of Unix. 
It seems like every flavor of Unix and Linux has different arguments and 
processes them in different ways. All we can tell you is that you’ll need to 
thoroughly test against all systems on which your script will be running. 


You can easily search for anything you can express as a regular expression, 
but make sure your expressions are specific enough not to match anything 
else. That’s why we used bin/[s]shd instead of just [s]shd, which would 
also match user connections (see Recipe 17.20). At the same time, 
/usr/sbin/[s]shd might be bad in case some crazy system doesn’t use that 
location. There is often a fine line between too much and not enough 
specificity. For example, you may have a program that can run multiple 
instances using different configuration files, so make sure you search for the 
config file as well if you need to isolate the correct instance. The same thing 
may apply to users, if you are running with enough rights to see other users’ 
processes. 


WARNING 


In versions of Solaris older than 11.3 SRU 5, ps was hardcoded to limit 
arguments to only 80 characters. If you have long paths or commands and still 
need to check for a config filename, you may run into that limit. 


See Also 


■ man ps
■ man grep
■ Recipe 17.20, “Grepping ps Output Without Also Getting the grep Process
Itself”
■ man pgrep
■ man pidof
■ man killall
■ https://blogs.oracle.com/casper/solaris-1 13-sru-56:-updates-in-ps l-and-procltpidgtcmdline, environ,execname


17.22 Adding a Prefix or Suffix to Output 


Problem 


You’d like to add a prefix or a suffix to each line of output from a given
command for some reason. For example, you’re collecting last statistics from
many machines, and it’s much easier to grep or otherwise parse the data you
collect if each line contains the hostname.


Solution 


Pipe the appropriate data into a while read loop and printf as needed. For
example, this prints the $HOSTNAME, followed by a tab, followed by any
nonblank lines of output from the last command:


last | while read i; do [[ -n "$i" ]] && printf "%b" "$HOSTNAME\t$i\n"; done


Or you can use awk to add text to each line:


last | awk "BEGIN { OFS=\"\t\" } ! /^\$/ { print \"$HOSTNAME\", \$0}"


Or, to write a new logfile, use:


last | while read i; do [[ -n "$i" ]] && printf "%b" "$HOSTNAME\t$i\n"; \
    done > last_$HOSTNAME.log


or:


last | awk "BEGIN { OFS=\"\t\" } ! /^\$/ { print \"$HOSTNAME\", \$0}" \
    > last_$HOSTNAME.log


Discussion 


We use [[ -n "$i" ]] to remove any blank lines from the last output, and
then we use printf to display the data. Quoting for this method is simpler, but
it uses more steps (last, while, and read, as opposed to just last and awk).
You may find one method easier to remember, more readable, or faster than
the other, depending on your needs.


There is a trick to the awk command we used here. Often you will see single 
quotes surrounding awk commands to prevent the shell from interpreting awk 
variables as shell variables. However, in this case we want the shell to
interpolate $HOSTNAME, so we surround the command with double quotes.
That requires us to use backslash escapes on the elements of the command 
that we do not want the shell to handle, namely the internal double quotes, 
the $ end-of-line anchor, and the awk $0 variable, which contains the current 
line. 
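If the backslashes become hard to read, an alternative is to pass the shell value in with awk’s standard -v option, so the awk program itself can stay in single quotes. In this sketch the function name prefix_host and the sample input are made up:

```shell
#!/usr/bin/env bash
# Prefix every nonblank input line with the hostname and a tab.
prefix_host() {
    # host is an awk variable set from the shell; no escaping needed inside.
    awk -v host="$HOSTNAME" 'BEGIN { OFS="\t" } ! /^$/ { print host, $0 }'
}

# Sample input standing in for the output of last, including a blank line.
printf 'jp  ttyp0\n\nroot ttyp1\n' | prefix_host
```

One caveat: awk interprets backslash escapes in a -v value, which is harmless for hostnames but matters for arbitrary strings.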


For a suffix, simply move the $0 variable:


last | while read i; do [[ -n "$i" ]] && printf "%b" "$i\t$HOSTNAME\n"; done


or with awk:


last | awk "BEGIN { OFS=\"\t\" } ! /^\$/ { print \$0, \"$HOSTNAME\"}"


You could also use Perl:


last | perl -ne "print qq($HOSTNAME\t\$_) if ! /^\s*$/;"


or sed (note the → denotes a literal tab character, typed by pressing Ctrl-V
then Ctrl-I):


last | sed "s/./$HOSTNAME → &/; /^$/d"


In the Perl command, we use qq() instead of double quotes to avoid having 
to escape the parts of the command we don’t want the shell to interpret. The 
last part is a regular expression that matches a line containing either nothing 
or only whitespace, and $_ is the Perl idiom for the current line. In the sed 
command we replace any line containing at least one character with the prefix 
and the character that matched (&), then delete any blank lines. 


See Also 

■ Effective awk Programming, 4th Edition, by Arnold Robbins
■ sed & awk, 2nd Edition, by Arnold Robbins and Dale Dougherty
■ Recipe 1.8, “Using Shell Quoting”
■ Recipe 13.15, “Trimming Whitespace”
■ Recipe 13.18, “Processing Files with No Line Breaks”


17.23 Numbering Lines 


Problem 


You need to number the lines of a text file for reference or for use as an 
example. 


Solution 


Thanks to Michael Wang for contributing the following shell-only
implementation and reminding us about cat -n. Note that our sample file
named lines has a trailing blank line:


$ i=0; while IFS= read -r line; do (( i++ )); echo "$i $line"; done < lines
1 Line 1
2 Line 2
3
4 Line 4
5 Line 5
6


Or a useful use of cat:


$ cat -n lines
     1  Line 1
     2  Line 2
     3
     4  Line 4
     5  Line 5
     6

$ cat -b lines
     1  Line 1
     2  Line 2

     3  Line 4
     4  Line 5


Discussion 
If you only need to display the line numbers on the screen, you can use less
-N:


$ /usr/bin/less -N filename
      1 Line 1
      2 Line 2
      3
      4 Line 4
      5 Line 5
      6
lines (END)


WARNING


Line numbers are broken in old versions of less on some obsolete Red Hat
systems. Check your version with less -V. Version 358+iso254 (e.g., Red Hat
7.3 & 8.0) is known to be bad. Version 378+iso254 (e.g., RHEL3) and version
382 (RHEL4, Debian Sarge) are known to be good; we did not test other
versions. The problem is subtle and may be related to an older iso256 patch.
You can easily compare last line numbers as the vi and Perl examples are
correct.


You can also use vi (or view, which is read-only vi) with the :set nu!
command:


$ vi filename
      1 Line 1
      2 Line 2
      3
      4 Line 4
      5 Line 5
      6
~
:set nu!


vi has many options, so you can start vi by doing things like vi +3 -c 'set
nu!' filename to turn on line numbering and place your cursor on line 3. If
you’d like more control over how the numbers are displayed, you can also
use nl, awk, or perl:

$ nl lines
     1  Line 1
     2  Line 2

     3  Line 4
     4  Line 5

$ nl -ba lines
     1  Line 1
     2  Line 2
     3
     4  Line 4
     5  Line 5
     6

$ awk '{ print NR, $0 }' filename
1 Line 1
2 Line 2
3
4 Line 4
5 Line 5
6

$ perl -ne 'print qq($.\t$_);' filename
1 → Line 1
2 → Line 2
3 →
4 → Line 4
5 → Line 5
6 →

NR and $. are the line number in the current input file in awk and Perl
respectively, so it’s easy to use them to print the line number. Note that we
are using a → to denote a tab character in the Perl output, while awk uses a
space by default.


See Also 


■ man cat
■ man nl
■ man awk
■ man less
■ man vi
■ Recipe 8.15, “Doing More with less”


17.24 Writing Sequences


Problem 


You need to generate a sequence of numbers, possibly with other text, for 
testing or some other purpose. 


Solution 


Use awk because it should work everywhere no matter what:


$ awk 'END { for (i=1; i <= 5; i++) print i, "text"}' /dev/null
1 text
2 text
3 text
4 text
5 text

$ awk 'BEGIN { for (i=1; i <= 5; i+=.5) print i}' /dev/null
1
1.5
2
2.5
3
3.5
4
4.5
5


Discussion 


On some systems, notably Solaris, awk will hang waiting for a file unless you 
give it one, such as /dev/null. This has no effect on other systems, so it’s fine 
to use everywhere. 


Note that the variable in the print statement is i, not $i. If you accidentally
use $i it will be interpolated as a field from the current line being processed.
Since we’re processing nothing, that’s what you’ll get if you use $i by
accident (i.e., nothing).


The BEGIN and END patterns allow for startup or cleanup operations when 
actually processing files. Since we’re not processing a file, we need to use 
one of them so that awk knows to actually do something even though it has 
no normal input. In this case, it doesn’t matter which we use. 
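If the bounds need to come from shell variables, one approach is awk's -v option. What follows is only a minimal sketch: the helper name awk_seq and its argument order (first, last, optional step) are our own assumptions for illustration.

```shell
# awk_seq is a hypothetical helper: awk_seq FIRST LAST [STEP]
awk_seq() {
    awk -v first="$1" -v last="$2" -v step="${3:-1}" \
        'BEGIN { for (i = first; i <= last; i += step) print i }' /dev/null
}

awk_seq 1 3        # whole numbers from 1 through 3
awk_seq 1 2 0.5    # fractional step
```

Passing the values with -v keeps them out of the awk program text, so no shell quoting tricks are needed inside the single quotes.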


There is a GNU utility called seq that does exactly what this recipe calls for,
but it does not exist by default on many systems (for example, Solaris and
older macOS and BSDs). It offers some useful formatting options and is
numeric only, but be aware that you may find differences between the BSD
and GNU versions.
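If you would like to use seq where it exists and still run elsewhere, a wrapper along these lines may help. It is a sketch under assumptions: gen_seq is a name we made up, and it handles only two integer arguments.

```shell
# gen_seq is a hypothetical wrapper: prefer seq when present, else use awk
gen_seq() {
    if command -v seq >/dev/null 2>&1; then
        seq "$1" "$2"
    else
        awk -v first="$1" -v last="$2" \
            'BEGIN { for (i = first; i <= last; i++) print i }' /dev/null
    fi
}

gen_seq 3 5
```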


Thankfully, as of bash 2.04 and later, you can do integer arithmetic in for 
loops: 


# Bash 2.04+ only, integer only
$ for ((i=1; i<=5; i++)); do echo "$i text"; done
1 text
2 text
3 text
4 text
5 text


As of bash 3.0 there is also the {x..y} brace expansion, which allows
integers or single characters:


# Bash 3.0+ only, integer or single character only
$ printf "%s text\n" {1..5}
1 text
2 text
3 text
4 text
5 text


$ printf "%s text\n" {a..e}
a text
b text
c text
d text
e text
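One limit of the brace form is worth seeing for yourself: bash performs brace expansion before it expands variables, so the endpoints must be literal. The short sketch below (the variable names are ours) contrasts that with the arithmetic for loop, which accepts variables.

```shell
n=3

# Brace expansion runs before $n is expanded, so the braces stay literal
literal=$(echo {1..$n})
echo "$literal"

# The arithmetic for loop is happy to use a variable as its bound
counted=$(for ((i=1; i<=n; i++)); do printf '%s ' "$i"; done)
echo "$counted"
```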


In bash 4.0 and later, you may use leading zeros in the {x..y} brace
expansion:


# Bash 4.0+ only, optional leading zeros with integers
$ for num in {01..16}; do echo ssh server$num; done
ssh server01
ssh server02
ssh server03
...
ssh server14
ssh server15
ssh server16
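If you are stuck on a bash older than 4.0, you can get similar zero-padded names by combining the arithmetic for loop with a printf format. This is a sketch only; padded_names is a hypothetical helper and the server prefix is simply borrowed from the example above.

```shell
# padded_names is a hypothetical helper: padded_names FIRST LAST
padded_names() {
    local i
    for ((i=$1; i<=$2; i++)); do
        printf 'server%02d\n' "$i"    # %02d pads to two digits with zeros
    done
}

padded_names 8 10
```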


See Also 


= man seq
= man awk
= http://www.faqs.org/faqs/computer-lang/awk/faq/


17.25 Emulating the DOS Pause Command 


Problem 


You are migrating from DOS/Windows batch files and want to emulate the 
DOS pause command. 


Solution 


To do that, use the read -n1 -p command in a function: 


pause ()
{
    read -n1 -p 'Press any key when ready...'
}


WARNING 


-n was introduced in bash 2.04. If you must omit the -n1 (though really, if
you’re using a bash that old you should upgrade it), then the prompt as shown is
not correct, because you must end the input by hitting the Enter key. You
should use something like this instead: read -p 'Press the ENTER key when
ready...'.


Discussion 


The -n nchars option will return after reading nchars characters, or a newline. So, -n1
returns after (wait for it...) any key. The -p option followed by a string 
argument prints the string before reading input. In this case the string is the 
same as the DOS pause command’s output. 
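You may prefer a slightly quieter variant. The sketch below assumes a reasonably recent bash and adds two options the recipe does not use: -s so the key you press is not echoed, and -r so a backslash is taken literally.

```shell
pause() {
    # -r: keep backslashes literal; -s: do not echo the key that was pressed
    read -r -s -n1 -p 'Press any key when ready...'
    local rc=$?
    echo    # finish the prompt line so later output starts on a fresh line
    return "$rc"
}

# Call pause wherever the batch file used PAUSE.
```

The prompt is shown only when input comes from a terminal, so the function is quiet when a script feeds it input from a file or pipe.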


See Also 


= help read 
= Recipe 1.12, “Keeping bash Updated” 


17.26 Commifying Numbers 


Problem 


You'd like to add a thousands-place separator to long numbers. 


Solution 


Depending on your system and configuration, you may be able to use printf’s '
format flag with a suitable locale. Thanks to Chet Ramey for this solution,
which is by far the easiest if it works:


$ LC_NUMERIC=en_US.UTF-8 printf "%'d\n" 123456789
123,456,789
$ LC_NUMERIC=en_US.UTF-8 printf "%'f\n" 123456789.987
123,456,789.987000
$


Thanks to Michael Wang for contributing the shell-only implementation and 
the relevant discussion. 


Example 17-6. ch17/func_commify 


# cookbook filename: func_commify 


function commify {
    typeset text=${1}

    typeset bdot=${text%%.*}
    typeset adot=${text#${bdot}}

    typeset i commified
    (( i = ${#bdot} - 1 ))

    while (( i>=3 )) && [[ ${bdot:i-3:1} == [0-9] ]]; do
        commified=",${bdot:i-2:3}${commified}"
        (( i -= 3 ))
    done
    echo "${bdot:0:i+1}${commified}${adot}"
}


Or you can try one of the sed solutions from the sed FAQ. For example:


sed ':a;s/\B[0-9]\{3\}\>/,&/;ta' /path/to/file   # GNU sed
sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta' /path/to/file   # other seds


Discussion 


The shell function is written to follow the same logical process as a person 
using a pencil and paper. First you examine the string and find the decimal 
point, if any. You ignore everything after the dot, and work on the string 
before the dot. 


The shell function saves the string before the dot in $bdot, and after the dot 
(including the dot) in $adot. If there is no dot, then everything is in $bdot,
and $adot is empty. Next, a person would move from right to left in the part
before the dot and insert a comma when these two conditions are met:


= There are four or more characters left.
= The character before the comma is a number.


The function implements this logic in the while loop.
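To watch the two conditions at work, it helps to run the function on a few inputs. The sketch below repeats commify from Example 17-6 only so that it runs on its own; the sample values are ours.

```shell
# Copy of commify from Example 17-6, repeated so this snippet is self-contained
function commify {
    typeset text=${1}
    typeset bdot=${text%%.*}
    typeset adot=${text#${bdot}}
    typeset i commified
    (( i = ${#bdot} - 1 ))
    while (( i>=3 )) && [[ ${bdot:i-3:1} == [0-9] ]]; do
        commified=",${bdot:i-2:3}${commified}"
        (( i -= 3 ))
    done
    echo "${bdot:0:i+1}${commified}${adot}"
}

commify 1234567.891    # both conditions hold twice; the fraction is untouched
commify -1234          # one comma; the sign stays attached to the leading digit
commify -123           # no comma: the character before it would be "-"
commify 123            # no comma: fewer than four characters
```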


Recipe 2.16 in Tom Christiansen and Nathan Torkington’s Perl Cookbook, 
2nd Edition (O’Reilly) also provides a string processing solution, reproduced 
in Example 17-7. 


Example 17-7. ch17/perl_sub_commify 


# cookbook filename: perl_sub_commify
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Add comma thousands separator to numbers
# Returns: input string, with any numbers commified
# From Perl Cookbook2 2.16, pg 84
# Note: carp comes from the Carp module, so the calling script needs "use Carp;"
sub commify {
    @_ == 1 or carp ('Sub usage: $withcomma = commify($somenumber);');

    # From _Perl_Cookbook_1 2.17, pg 64, or _Perl_Cookbook_2 2.16, pg 84
    my $text = reverse $_[0];
    $text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;
    return scalar reverse $text;
}


The United States uses a comma as the thousands separator, but many other 
countries use a period. 
See Also 


= Section 4.14 of the sed FAQ
= Perl Cookbook, 2nd Edition, Recipe 2.16, by Tom Christiansen and
Nathan Torkington (O’Reilly)
= Recipe 13.19, “Converting a Datafile to CSV”


! See http://www.columbia.edu/~rh120/ch106.x09. 


Chapter 18. Working Faster by Typing Less


Despite all the improvements in processor speed, transmission rates, network 
speed, and I/O capabilities, there is still a limiting factor in many uses of bash 
—the typing speed of the user. Scripting has been our focus, of course, but 
interactive use of bash is still a significant part of its use and usefulness. 
Many of the scripting techniques we have described can be used interactively 
as well, but then you find yourself faced with a lot of typing, unless you 
know some shortcuts. 


“Back in the day,” when Unix was first invented, there were teletype 
machines that could only crank out about 10 characters per second, and a 
good touch typist could type faster than the keyboard could handle it. It was 
in this milieu that Unix was developed, and some of its terseness is likely due 
to the fact that no one wanted to type more than absolutely necessary to get 
their commands across. 


At the other end of the historical perspective (i.e., now), processors are so fast
that they can be quite idle while waiting for user input, and can look back
through histories of previous commands as well as in directories along your
$PATH to find possible commands and valid arguments even before you finish
typing them.

Combining techniques developed for each of these situations, we can greatly 
reduce the amount of typing required to issue shell commands—and not just 
out of sheer laziness. Rather, you’re likely to find these keystroke-saving 
measures useful because of the increased accuracy they provide, the mistakes 
they help you avoid, and the backups that you don’t need to reload. 


18.1 Moving Quickly Among Arbitrary Directories


Problem 


You find yourself moving frequently between two or more directories, cd’ing
here, then there, and then back again. The directories never seem to be close 
by, and you’re tired of always typing long pathnames. 


Solution 


Use the pushd and popd builtin commands to manage a stack of directory 
locations, and to switch between them easily. Here is a simple example: 


$ cd /tmp/tank 


$ pwd 
/tmp/tank 


$ pushd /var/log/cups 
/var/log/cups /tmp/tank 


$ pwd 
/var/log/cups 


$ ls 
access_log error_log page_log


$ popd 
/tmp/tank 


$ ls 
empty full 


$ pushd /var/log/cups 
/var/log/cups /tmp/tank 


$ pushd 
/tmp/tank /var/log/cups 


$ pushd 
/var/log/cups /tmp/tank 


$ pushd 
/tmp/tank /var/log/cups


$ dirs 
/tmp/tank /var/log/cups 


Discussion 


Stacks are last in, first out mechanisms, which is how these commands 
behave. When you pushd to a new directory, it keeps the previous directory 
on a stack. Then when you popd, it pops the current location off of the stack 
and puts you back in that first location. When you change locations using 
these commands, they will print the values on the stack, left to right, 
corresponding to the top-to-bottom ordering of the stack. 


If you use pushd without specifying a directory, it swaps the top item on the 
stack with the next one down, so that you can alternate between two 
directories using repeated pushd commands with no arguments. You can do 
the same thing using the cd - command. 
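The same stack is handy inside scripts, where you often want to visit a directory, run one command, and come back. Here is a minimal sketch; in_dir is a name we invented, and it sends the stack listing that pushd and popd print to /dev/null.

```shell
# in_dir is a hypothetical helper: in_dir DIRECTORY COMMAND [ARGS...]
in_dir() {
    local dir=$1
    shift
    pushd "$dir" >/dev/null || return 1    # remember where we were, then move
    "$@"
    local rc=$?
    popd >/dev/null || return 1            # pop back to the starting directory
    return "$rc"
}

in_dir / pwd    # runs pwd in /, then returns to where you started
```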


You can still cd to locations—that will change the current directory, which is 
also the top of the directory stack. If you can’t remember what is on your 
stack of directories, use the dirs builtin command to echo the stack, left-to- 
right. For a more stack-like display, use the -v option: 


$ dirs -v 
0 /var/tmp 
1 ~/part/me/scratch 
2 /tmp 

$ 


The tilde (~) is a shorthand for your home directory. The numbers can be 
used to reorder the stack. If you pushd +2, then bash will put the #2 entry on 
the top of the stack (and cd you there) and push the others down: 


$ pushd +2 

/tmp /var/tmp ~/part/me/scratch 
$ dirs -v 

0 /tmp 
1 /var/tmp
2 ~/part/me/scratch 


If you want that stack-like listing of directories, but without the numbers, use 
the -p option: 


$ dirs -p 

/tmp 

/var/tmp 
~/part/me/scratch 
$ 


Once you get a little practice with these commands, you will find it much 
faster and easier to move repeatedly between directories. 


See Also 

= Recipe 1.4, “Showing Where You Are” 

= Recipe 14.3, “Setting a Secure $PATH” 

= Recipe 16.6, “Setting Your $CDPATH” 

= Recipe 16.15, “Creating a Better cd Command” 

= Recipe 16.22, “Getting Started with a Custom Configuration” 


18.2 Repeating the Last Command 


Problem 


You just typed a long and difficult command line, one with long pathnames 
and complicated sets of arguments. Now you need to run it again. Do you 
have to type it all again? 


Solution 


There are two very different solutions to this problem. First, just type two
exclamation marks at the prompt, and bash will echo and repeat the previous
command. For example:


$ /usr/bin/somewhere/someprog -g -H -yknot -w /tmp/soforthandsoon

$ !!
/usr/bin/somewhere/someprog -g -H -yknot -w /tmp/soforthandsoon

$


The other (more modern) solution involves using the arrow keys. Pressing the
up-arrow key will scroll back through the previous commands that you have
issued. When you find the one you want, just press the Enter key and that
command will be run (again).


Discussion


The command is echoed when you type !! (sometimes called bang bang) so 
that you can see what is running. 


WARNING 


Chet tells us that the csh-style bang history may no longer be enabled by default 
in a future version of bash because he has had multiple requests to turn that off. 
It would still be available as an option, however. 


See Also 


= Recipe 16.10, “Adjusting readline Behavior Using .inputrc”
= Recipe 16.14, “Setting Shell History Options” 


= Recipe 18.3, “Running Almost the Same Command” 


18.3 Running Almost the Same Command 


Problem


After running a long and difficult-to-type command, you got an error 
message indicating that you’d made one tiny little typo in the middle of that 
command line. Do you have to retype the whole line? 


Solution 


The !! command that we discussed in Recipe 18.2 allows you to add an 
editing qualifier. How good are your sed-like skills? Add a colon after the 
bang bang and then a sed-like substitution expression, as in the following 
example: 


$ /usr/bin/somewhere/someprog -g -H -yknot -w /tmp/soforthandsoon 
Error: -H not recognized. Did you mean -A? 
$ !!:s/H/A/ 


/usr/bin/somewhere/someprog -g -A -yknot -w /tmp/soforthandsoon 


$ 


You can always just use the arrow keys to navigate your history and 
commands, as described in the previous recipe, but for long commands on 
slow links this syntax is great once you get used to it. 


Discussion 


If you’re going to use this feature, be careful with your substitutions. If you 
had tried to change the -g option by typing !!:s/g/h/ you would have ended 
up changing the first letter g, which is at the end of the command name, and 
you would be trying to run /usr/bin/somewhere/someproh. 


If you want to change all occurrences of an expression in a command line, 
you need to precede the s with a g (for global substitution), as follows: 


$ /usr/bin/somewhere/someprog -g -s -yknots -w /tmp/soforthandsoon 


$ !!:gs/s/S/ 
/uSr/bin/Somewhere/Someprog -g -S -yknotS -w /tmp/SoforthandSoon


$ 


Why does this g have to appear before the s and not after it, like in sed 
syntax? Well, anything that appears after the closing slash will be considered 
new text to append to the command—which is quite handy if you want to add 
another argument to the command when you run it again. 
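History expansion is normally off in scripts, so you cannot easily experiment with !!:s and !!:gs there. As a rough analogy only (this is parameter expansion, not the history mechanism), bash variables draw the same first-match versus every-match distinction:

```shell
cmd='/usr/bin/somewhere/someprog -g -H'

first_only=${cmd/g/h}     # one slash: first match only, like !!:s/g/h/
every_one=${cmd//s/S}     # two slashes: every match, like !!:gs/s/S/

echo "$first_only"
echo "$every_one"
```

The first result shows the same surprise described above: the first g belongs to the command name, not to the -g option.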


See Also 


= Recipe 16.10, “Adjusting readline Behavior Using .inputrc” 
= Recipe 16.14, “Setting Shell History Options” 
= Recipe 18.2, “Repeating the Last Command” 


18.4 Quick Substitution 


Problem 


You'd like to know if there’s a simpler syntax for making substitutions in 
your previously executed command and running the modified result. 


Solution 

Use the caret (^) substitution mechanism: 
$ /usr/bin/somewhere/someprog -g -A -yknot -w /tmp/soforthandsoon 
$ ^-g -A^-gB^ 


/usr/bin/somewhere/someprog -gB -yknot -w /tmp/soforthandsoon 


You can always just use the arrow keys to navigate your history and 
commands, but for long commands on slow links this syntax is great once 
you get used to it. 


Discussion


Write the substitution on the command line by starting with a caret (^) and
then the text you want replaced, then another caret and the new text. A
trailing (third) caret is needed only if you want to add more text at the end of
the line, as in:


$ /usr/bin/somewhere/someprog -g -A -yknot 


$ ^-g -A^-gB^ /tmp^ 
/usr/bin/somewhere/someprog -gB -yknot /tmp 


If you want to remove something, substitute an empty value; i.e., don’t put 
anything for the new text. Here are two examples: 


$ /usr/bin/somewhere/someprog -g -A -yknot /tmp 


$ ^-g -A^^
/usr/bin/somewhere/someprog -yknot /tmp


$ ^knot^
/usr/bin/somewhere/someprog -gA -y /tmp 


$ 


The first example uses all three carets. The second example leaves off the 
third caret; since we want to replace the “knot” with nothing, we just end the 
line with a newline (the Enter key). 


The use of caret substitution is just plain handy. Many bash users find it
easier to use than the !!:s/../../ syntax demonstrated in Recipe 18.3. What
do you think?


See Also 


= Recipe 16.10, “Adjusting readline Behavior Using .inputrc” 
= Recipe 16.14, “Setting Shell History Options” 
= Recipe 18.3, “Running Almost the Same Command”


18.5 Reusing Arguments 


Problem 


Reusing the last command is easy with !!, but you don’t always want the 
whole command. How can you reuse just the last argument? 


Solution 


Use !$ to indicate the last argument of the preceding command. Use !:1 for 
the first argument on the command line, !:2 for the second, and so on.


Discussion 


It is quite common to hand the same filename to a series of commands. One 
of the most common occurrences might be the way a programmer would edit 
and then compile, edit and then compile.... Here, the !$ comes in quite 
handy: 


$ vi /some/long/path/name/you/only/type/once


$ gcc !$ 
gcc /some/long/path/name/you/only/type/once 


$ vi !$ 
vi /some/long/path/name/you/only/type/once 


$ gcc !$ 
gcc /some/long/path/name/you/only/type/once 


$ 


Get the idea? It saves a lot of typing, but it also avoids errors. If you mistype 
the filename when you compile, then you are not compiling the file that you 
just edited. With !$ you always get the name of the file on which you just
worked. If the argument you want is buried in the middle of the command 
line, you can get at it with the numbered “bang-colon” commands. Here’s an 
example: 


$ munge /opt/my/long/path/toa/file | more 


$ vi !:1
vi /opt/my/long/path/toa/file 


$ 


You might be tempted to try to use !$, but in this instance it would yield 
more, which is not the name of the file that you want to edit. 
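The !$ and !:n designators belong to interactive history. Inside a script, the closest relative is the special parameter $_, which holds the last argument of the previous command after expansion. The sketch below is an illustration under that assumption; the directory names are made up.

```shell
workdir=$(mktemp -d)

# $_ expands to the last argument of the mkdir command that just ran
landed=$(mkdir -p "$workdir/some/long/path" && cd "$_" && pwd)

echo "$landed"
rm -rf "$workdir"
```

As with !$, the benefit is that the long pathname is typed only once, so the second command cannot drift from the first.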


See Also 


= The bash manpage on “Word Designators” 


= Recipe 18.2, “Repeating the Last Command” 


18.6 Finishing Names for You 


Problem 


Sometimes pathnames get pretty long. This is a computer that bash is running 
on... can’t it help? 


Solution 


When in doubt, press the Tab key. bash will try to finish the pathname for 
you. If it does nothing, it may be because there are no matches, or because 
there is more than one. Press the Tab key a second time and it will list the 
choices and then repeat the command up to where you stopped typing, so that 
you can continue. Type a bit more (to disambiguate), then press the Tab key 
again to have bash finish off the argument for you. 


Discussion


bash is even smart enough to limit the selection to certain types of files. If 
you type unzip and then the beginning of a pathname, and then you press the 
Tab key, it will only finish off with files that end in .zip even if you have 
other files whose names match as much as you have typed. For example: 


$ ls 

myfile.c myfile.o myfile.zip 

$ ls -lh myfile<tab><tab> 

myfile.c myfile.o myfile.zip 

$ ls -lh myfile.z<tab>ip 

-rw-r--r-- 1 me mygroup 1.9M 2006-06-06 23:26 myfile.zip
$ unzip -l myfile<tab>.zip 


$ 


WARNING 


That last example for unzip requires the bash-completion package we discussed 
in Recipe 16.19. If bash is not able to offer completion suggestions, make sure 
that package is installed. 


See Also 
= Recipe 16.10, “Adjusting readline Behavior Using .inputrc”


= Recipe 16.19, “Improving Programmable Completion” 


18.7 Playing It Safe 


Problem 


It’s so easy to type the wrong character by mistrake (see!). Even for simple 
bash commands this can be quite serious—you could move or remove the 
wrong files. When pattern matching is added to the mix, the results can be 
even more exciting, as a typo in the pattern can lead to wildly
different-than-intended consequences. What’s a conscientious person to do?


Solution 


You can use these history features and keyboard shortcuts to repeat 
arguments without retyping them, thereby reducing the chance of typos. If 
you need a tricky pattern match for files, try it out with echo to see that it 
works, and then when you’ve got it right use !$ to use it for real. For
example: 


$ ls
ab1.txt ac1.txt jb1.txt wc3.txt
$ echo *1.txt
ab1.txt ac1.txt jb1.txt
$ echo [aj]?1.txt
ab1.txt ac1.txt jb1.txt
$ echo ?b1.txt
ab1.txt jb1.txt
$ rm !$
rm ?b1.txt
$


Discussion 


echo is a way to see the results of your pattern match. Once you’re convinced 
it gives you what you want, then you can use it for your intended command. 
Here we removed the named files—not something that one wants to get 
wrong. 


Also, when you’re using the history commands, you can add a :p modifier 
and it will cause bash to print but not execute the command—another handy 
way to see if you got your history substitutions right. At the end of the 
example in the Solution section, we could have done this: 


$ echo ?b1.txt 
ab1.txt jb1.txt 
$ rm !$:p
rm ?b1.txt
$ 


The :p modifier causes bash to print but not execute the command—but 
notice that the argument is ?b1.txt and is not expanded to the two
filenames. This option shows you what will be run, but only when it is run 
will the shell expand that pattern to the two filenames. If you want to see how 
it will be expanded, use the echo command. 
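In a script you can apply the same look-before-you-leap habit by expanding the pattern once into an array, reporting it, and then acting on that exact list. This is a sketch in a throwaway directory; the variable names are ours and the filenames are borrowed from the example above.

```shell
demo=$(mktemp -d)
touch "$demo"/ab1.txt "$demo"/ac1.txt "$demo"/jb1.txt "$demo"/wc3.txt

# Expand the pattern once and keep the result
matches=( "$demo"/?b1.txt )
printf 'would remove: %s\n' "${matches[@]##*/}"

# Act on the saved list, not on a retyped pattern
rm -- "${matches[@]}"

remaining=$(ls "$demo")
echo "$remaining"
rm -rf "$demo"
```

Because rm receives the saved list, what is removed is exactly what was reported, even if the pattern would match something new a moment later.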


See Also 


= The bash manpage on “Modifiers” for more colon (:) modifiers that can 
be used on history commands 


= “Command-Line Processing Steps” in Appendix C 


= Recipe 18.5, “Reusing Arguments”


18.8 Big Changes, More Lines 


Problem 


What if the changes that need to be made are too complicated for a single
substitution or span several command lines? Sometimes you find yourself
running the history command and redirecting the output to a file, editing
that file, and running those commands as a script after editing. Is there an
easier way?


Solution 


The fc command will put the most recent command (or a range of commands)
in a temporary file, invoke your editor and let you edit the command(s) any
way you see fit, and then automatically rerun the edited version of the
command(s) when you exit the editor.


Discussion


Which editor will it invoke? It will use the one defined in the shell variable 
FCEDIT, if it is set. If that one is empty it will use the more general EDITOR 
variable, and if that is also empty it will use vi. 
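That lookup order can be written as a single nested default expansion. The function below only reports which editor would be chosen (fc does the real lookup internally), and fc_editor is a name we made up for the illustration.

```shell
# fc_editor is a hypothetical helper that mirrors the lookup order described above
fc_editor() {
    echo "${FCEDIT:-${EDITOR:-vi}}"
}

fc_editor
```

The :- form treats an empty variable the same as an unset one, which matches the "if that one is empty" wording above.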


Which line(s) will appear in the editor? If you invoke fc with no arguments, it
will use only the last line. You can also specify a particular line with a single 
argument: fc 1004 will use line 1004 from your command history, whereas 
fc -5 will use the line five previous from the most recent command. 
Similarly, you can specify a range of arguments, such as fc 1001 1005, 
which will let you edit lines 1001 through 1005 inclusive in your command 
history, or fc -5 -1, which allows you to edit the last five commands that 
you ran. 


WARNING 


When the fc command has invoked the editor it will rerun whatever commands
are left in the file when you exit—even if you exit without making any changes. 
What if you change your mind and don’t want to execute any commands? Then 
you should delete all the lines in the file and write that empty file out before 
exiting. Don’t try to suspend the editor, either. That will leave the shell and 
terminal in limbo. 


See Also 

= Recipe 18.3, “Running Almost the Same Command” 
= Recipe 18.4, “Quick Substitution” 

= Recipe 18.5, “Reusing Arguments” 

= Recipe 18.7, “Playing It Safe” 


= man fc for more options, including making multiline edits without 
invoking an editor 


Chapter 19. Tips and Traps: Common Goofs for Novices


Nobody’s perfect. We all make mistakes, especially when we are first 
learning something new. We have all been there, done that. You know, the 
silly mistake that seems so obvious once you’ve had it explained, or the time 
you thought for sure that the system must be broken because you were doing 
it exactly right, only to find that you were off by one little character—one 
which made all the difference. Certain mistakes seem common, almost 
predictable, among beginners. We’ve all had to learn the hard way that scripts
don’t run unless you set execute permissions on them—a real newbie kind of 
error. Now that we’re experienced, we never make those mistakes anymore. 
What, never? Well, hardly ever. After all, nobody’s perfect. 


19.1 Forgetting to Set Execute Permissions 


Problem 


You’ve got your script all written and want to try it out, but when you go to 
run the script you get an error message: 


$ ./my.script 
bash: ./my.script: Permission denied 


$ 


Solution 


You have two choices. First, you could invoke bash and give it the name of 
the script as a parameter: 


bash my.script 


Or second (and better still), you could set execute permissions on the script
so that you can run it directly: 


chmod a+x my.script 
./my.script 


Discussion 


Either method will get the script running. You’ll probably want to set execute 
permissions on the script if you intend to use it over and over. You only have 
to do this once, thereafter allowing you to invoke it directly. With the 
permissions set it feels more like a command, since you don’t have to 
explicitly invoke bash (of course, behind the scenes bash is still being 
invoked, but you don’t have to type it). 


In setting the permissions here, we used a+x to give execute permissions to 
all. There’s little reason to restrict execute permissions on the file unless it is 
in some directory where others might accidentally encounter your executable 
(e.g., if as a system admin you were putting something of your own in 
/usr/bin). Besides, if the file has read permissions for all, then others can still 
execute the script if they use our first form of invocation, with the explicit 
reference to bash. In octal mode, common permissions on shell scripts are 
0700 for the suspicious/careful folk (giving read/write/execute permission to 
only the owner) and 0755 for the more open/carefree folk (giving read and 
execute permissions to all others). 
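To see the two octal modes side by side, you can try them on a scratch file. The sketch below uses a temporary file of our own making, and reads the mode back from the first ten characters of the ls -l listing.

```shell
script=$(mktemp)
printf '%s\n' '#!/usr/bin/env bash' 'echo "Hello World!"' > "$script"

chmod 0700 "$script"    # owner only: read, write, and execute
careful=$(ls -l "$script" | cut -c1-10)

chmod 0755 "$script"    # owner as before; group and others: read and execute
carefree=$(ls -l "$script" | cut -c1-10)

echo "$careful $carefree"
rm -f "$script"
```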


See Also 


= man chmod 

= Recipe 14.13, “Setting Permissions”

= Recipe 15.1, “Finding bash Portably for #!” 

= Recipe 19.3, “Forgetting That the Current Directory Is Not in the $PATH” 


19.2 Fixing “No such file or directory” Errors


Problem 


You’ve set execute permissions as described in Recipe 19.1, but when you 
run the script you get a “No such file or directory” error. 


Solution 
Try running the script using bash explicitly: 


bash ./busted 


If it works, you have some kind of permissions error, or a typo in your 
shebang line. If you get a bunch more errors, you probably have the wrong 
line endings. This can happen if you’ve edited the file on Windows (perhaps 
via Samba), or if you’ve simply copied the file around. 


If you run the file command on your suspect script, it can tell you if your line 
endings are wrong. It may say something like this: 


$ file ./busted 

./busted: Bourne-Again shell script, ASCII text executable, with CRLF line terminators


$ 


To fix it, try the dos2unix program if you have it, or see Recipe 8.11. Note 
that if you use dos2unix it will probably create a new file and delete the old 
one, which will change the permissions and might also change the owner or 
group and affect hard links. If you’re not sure what any of that means, the key 
point is that you'll probably have to chmod it again (Recipe 19.1). 
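If dos2unix is not available, a small function can do the job while sidestepping the permissions problem, because it writes the cleaned text back through the original filename instead of replacing the file. This is a sketch, not necessarily the method in Recipe 8.11; has_cr and strip_cr are names we invented, and note that it removes every carriage return, not only those at line ends.

```shell
# has_cr succeeds if the file contains at least one carriage return
has_cr() {
    grep -q $'\r' "$1"
}

# strip_cr removes carriage returns while keeping the same file (same inode)
strip_cr() {
    local tmp rc
    tmp=$(mktemp) || return 1
    # cat into the original name so mode, owner, and hard links survive
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Typical use: has_cr ./busted && strip_cr ./busted
```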


Discussion 


If you really do have bad line endings (i.e., anything that isn’t ASCII 10 or
hex 0a), the error you get depends on your shebang line. Here are some
examples for a script named busted: 


$ cat busted 
#!/bin/bash - 
echo "Hello World!" 


# This works 
$ ./busted 
Hello World! 


# But if the file gets DOS line endings, we get:
$ ./busted
: invalid option
Usage: /bin/bash [GNU long option] [option] ...
[...]


# Different shebang line 
$ cat ./busted 
#!/usr/bin/env bash 

echo "Hello World!" 


$ ./busted 
: No such file or directory 


See Also 


m Recipe 8.11, “Converting DOS Files to Linux Format” 
= Recipe 14.2, “Avoiding Interpreter Spoofing” 
= Recipe 15.1, “Finding bash Portably for #!” 


= Recipe 19.1, “Forgetting to Set Execute Permissions” 


19.3 Forgetting That the Current Directory Is Not in the $PATH


Problem


You’ve got your script all written and want to try it out—you even
remembered to add execute permissions to the script—but when you go to 
run it you get an error message: 


$ my.script 
bash: my.script: command not found 


$ 


Solution 


Either add the current directory to the $PATH variable, which we do not 
recommend, or reference the script via the current directory with a leading ./
before the script name, as in: 


./my.script 


Discussion 


It is a common mistake for beginners to forget to add the leading ./ to the
name of the script that they want to execute. We have had a lot of discussion 
about the $PATH variable, so we won’t repeat ourselves here except to remind 
you of a solution for frequently used scripts. 


A common practice is to keep your useful and often-used scripts in a 
directory called bin inside of your home directory, and to add that bin 
directory to your $PATH variable so that you can execute those scripts without 
needing the leading ./.


The important part about adding your own bin directory to your $PATH 
variable is to place the change that modifies your $PATH variable in the right 
startup script. You don’t want it in the .bashrc script because that gets 
invoked by every interactive subshell, which would mean that your path 
would get added to every time you “shell out” of an editor, or run some other 
commands. You don’t need repeated copies of your bin directory in the 
$PATH variable.


Instead, put it in the appropriate login profile for bash. According to the bash 
manpage, when you log in bash “looks for ~/.bash_profile, ~/.bash_login,
and ~/.profile, in that order, and reads and executes commands from the first
one that exists and is readable.” So, edit whichever one of those you already
have in your home directory or, if none exists, create ~/.bash_profile and put
this line in at the bottom of the file (or elsewhere if you understand enough of
what else the profile is doing):


PATH="${PATH}:$HOME/bin"
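If you cannot be sure the line will run only once, a guard makes repeated execution harmless. The function below is a minimal sketch: add_to_path is our own name, and it appends a directory only when $PATH does not already contain it.

```shell
# add_to_path is a hypothetical helper: append a directory only if it is missing
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, so do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path "$HOME/bin"
add_to_path "$HOME/bin"    # the second call changes nothing
```

Wrapping $PATH in colons before matching lets one pattern cover the first, middle, and last positions.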


See Also 

= Recipe 4.1, “Running Any Executable” 

= Recipe 14.3, “Setting a Secure $PATH” 

= Recipe 14.9, “Finding World-Writable Directories in Your $PATH” 
= Recipe 14.10, “Adding the Current Directory to the $PATH” 

= Recipe 15.2, “Setting a POSIX $PATH” 

= Recipe 16.4, “Changing Your $PATH Permanently” 

= Recipe 16.5, “Changing Your $PATH Temporarily” 

= Recipe 16.11, “Keeping a Private Stash of Utilities by Adding ~/bin” 
= Recipe 16.20, “Using Initialization Files Correctly” 


= Recipe 19.1, “Forgetting to Set Execute Permissions” 


19.4 Naming Your Script “test”


Problem 


You typed up a bash script to test out some of this interesting material that 
you’ve been reading about. You typed it exactly right, and you even 
remembered to set execute permissions on the file and put it in one of the 
directories in your $PATH, but when you try to run it, nothing happens. 


Solution


Name it something other than test. That name is a shell builtin command.


Discussion 


It is natural enough to want to name a file test when you just want a quick 
scratch file for trying out some small bit of code. The problem is that test is a 
shell builtin command, making it a kind of shell reserved word. You can see 
this with the type command: 


$ type test 
test is a shell builtin 
$ 


Since it is a builtin, no adjusting of the path will override this. You would 
have to create an alias, but we strongly advise against it in this case. Just 
name your script something else, or invoke it with a pathname, as in: ./test 
or /home/path/test. 
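Before settling on a name for a new script, you can ask the shell how it would resolve that name. This short sketch uses two options of the type builtin; the variable name kind is ours.

```shell
# type -t prints a single word describing how the shell resolves a name
kind=$(type -t test)
echo "test resolves as: $kind"

# type -a lists every match: the builtin first, then any files found in $PATH
type -a test
```

If type -t prints nothing for your chosen name, then no alias, keyword, function, builtin, or file in $PATH is using it.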


See Also 


= Recipe 19.1, “Forgetting to Set Execute Permissions” 

= Recipe 19.3, “Forgetting That the Current Directory Is Not in the $PATH” 
= “Builtin Commands” in Appendix A 

= “bash Reserved Words” in Appendix A 


19.5 Expecting to Change Exported Variables 


Problem 


You can’t get a subscript or script to pass an exported variable back to its 
parent shell or script. For example, the following script will set a value, 
invoke a second script, and then display the value after the second script 
completes, so as to show what (if anything) has changed: 


$ cat first.sh 
# 
# a simple example of a common mistake 
# 
# set the value: 
export VAL=5 
printf "VAL=%d\n" $VAL 
# invoke our other script: 
./second.sh 
# 
# now see what changed (hint: nothing!) 
printf "%b" "back in first\n" 
printf "VAL=%d\n" $VAL 
$ 


The second script messes with the variable named $VAL, too: 


$ cat second.sh 
printf "%b" "in second\n" 
printf "initially VAL=%d\n" $VAL 
VAL=12 
printf "changed so VAL=%d\n" $VAL 
$ 


When you run the first script (which invokes the second one) here’s what you 
get: 


$ ./first.sh 
VAL=5 
in second 
initially VAL=5 
changed so VAL=12 
back in first 
VAL=5 
$ 


Solution 


The old joke goes something like this: 
= Patient: “Doctor, it hurts when I do this.” 
= Doctor: “Then don’t do that.” 


The solution here is going to sound like the doctor’s advice: don’t do that. 
You will have to structure your shell scripts so that such a handoff is not 
necessary. One way to do that is by explicitly echoing the results of the 
second script so that the first script can invoke it with the $() operator (or `` 
for the old shell hands). In the first script, the line ./second.sh becomes 
VAL=$(./second.sh), and the second script has to echo the final value (and 
only the final value) to STDOUT (it could redirect its other messages to 
STDERR): 


$ cat second.sh 
printf "%b" "in second\n" >&2 
printf "initially VAL=%d\n" $VAL >&2 
VAL=12 
printf "changed so VAL=%d\n" $VAL >&2 
echo $VAL 
$ 

Discussion 


Exported environment variables are not globals that are shared between 
scripts. They are a one-way communication. All the exported environment 
variables are marshaled and passed together as part of the invocation of a 
Linux or Unix (sub)process (see the fork(2) manpage). There is no 
mechanism whereby these environment variables are passed back to the 
parent process. (Remember that a parent process can fork lots and lots of 
subprocesses...so if you could return values from a child process, which 
child’s values would the parent get?) 
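
The one-way nature of the environment can be seen in a few lines. In this sketch bash -c stands in for the second script, and after_child is a name invented here to hold the intermediate result.

```shell
export VAL=5

# The child receives a copy of VAL; changing the copy does not touch ours.
bash -c 'VAL=12'
after_child=$VAL
echo "after plain child: VAL=$after_child"

# To get a value back, have the child print it and capture its standard output.
VAL=$(bash -c 'VAL=12; echo "$VAL"')
echo "after capture:     VAL=$VAL"
```

The first echo still reports 5; only the captured output changes the parent's variable.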


See Also 
= man fork(2) 
= Recipe 5.5, “Exporting Variables” 
= Recipe 10.4, “Defining Functions” 


= Recipe 10.5, “Using Functions: Parameters and Return Values” 


19.6 Forgetting Quotes Leads to “command not 
found” on Assignments 


Problem 


Your script is assigning some values to a variable, but when you run it, the 
shell reports “command not found” on part of what you thought you assigned 
to the variable: 


$ cat goofi.sh 
#!/bin/bash - 
# common goof: 
# X=$Y $Z 
# isn't the same as 
# X="$Y $Z" 
# 
OPT1=-l 
OPT2=-h 
ALLOPT=$OPT1 $OPT2 
ls $ALLOPT . 

$ ./goofi.sh 
goofi.sh: line 9: -h: command not found 
aaa.awk cdscript.prev ifexpr.sh oldsrc xspin2.sh 
$ 


Solution 


You need quotes around the righthand side of the assignment to $ALLOPT. 
What is written in the script as: 


ALLOPT=$OPT1 $OPT2 


really should be: 


ALLOPT="$OPT1 $OPT2" 


Discussion 


This problem arises because of the space between the arguments. If the 
arguments were separated by an intervening slash, for example, or if there 
were no space at all between them, this problem wouldn’t crop up—it would 
all be a single word, and thus a single assignment. 


But that intervening space tells bash to parse this into two words. The first 
word is a variable assignment. Such assignments at the beginning of a 
command tell bash to set a variable to a given value just for the duration of 
the command—the command being the word that follows next on the 
command line. At the next line, the variable is back to its prior value (if any) 
or just not set. 


The second word of our example statement is therefore seen as a command. 
That word is the command that is reported as “not found.” Of course, it is 
possible that the value for $OPT2 might have been something that actually 
was the name of an executable (though that’s not likely in this case, with ls). 
Such a situation could lead to very undesirable results. 


Did you notice, in our example, that when ls ran, it didn’t use the long-format 
output even though we had (tried to) set the -l option? That shows that 
$ALLOPT was no longer set. It had only been set for the duration of the 
previous command, which was the (nonexistent) -h command bash attempted 
to run. 


An assignment on a line by itself sets a variable for the remainder of the 
script. An assignment at the beginning of a line, one that has an additional 
command invoked on that line, sets the variable only for the execution of that 
command. 
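
The two kinds of assignment can be compared side by side. This is only a sketch: GREETING and child_saw are illustrative names, and bash -c stands in for whatever command follows the assignment.

```shell
unset GREETING

# Assignment in front of a command: visible only to that one command.
child_saw=$(GREETING=hello bash -c 'echo "$GREETING"')
echo "child saw: $child_saw"
echo "afterward: ${GREETING:-<not set>}"

# Assignment on a line by itself, quoted so the space is part of the value.
GREETING="hello world"
echo "now set:   $GREETING"
```

The child sees hello, yet the second echo reports the variable as not set; only the standalone, quoted assignment persists.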


It’s generally a good idea to quote your assignments to a shell variable. That 
way you are assured of getting only one assignment and not encountering this 
problem. 




See Also 


= Recipe 5.9, “Handling Parameters with Spaces” 


19.7 Forgetting that Pattern Matching 
Alphabetizes 


Problem 


When you try to specify a character order in a character class in pattern 
matching, the result is not in the order you specified. 


Solution 
bash will alphabetize the data in a pattern match: 


$ echo x.[ba] 
x.a x.b 


$ 


Discussion 


Even though you specified b then a in the square brackets, when the pattern 
matching is done and the results found, they will be alphabetized before 
being given to the command to execute. That means that you don’t want to do 
this: 


mv x.[ba] 


thinking that it will expand to: 


mv x.b x.a 


Rather, it will expand to: 


mv x.a x.b 


since bash alpha-sorts the results before putting them in the command line, 
which is exactly the opposite of what you intended! 


However, if you use braces to enumerate your different values, it will keep 
them in the specified order. This will do what you intended and not change 
the order: 


mv x.{b,a} 
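

A harmless way to compare the two expansions is to use echo instead of mv. The scratch directory below is an assumption made for the demonstration, and the result assumes an ordinary locale.

```shell
scratch=$(mktemp -d)
touch "$scratch/x.a" "$scratch/x.b"

# Pathname expansion: matches are sorted before the command sees them.
glob_order=$(cd "$scratch" && echo x.[ba])

# Brace expansion: the words are generated in the order written.
brace_order=$(cd "$scratch" && echo x.{b,a})

echo "glob:  $glob_order"     # glob:  x.a x.b
echo "brace: $brace_order"    # brace: x.b x.a
rm -rf -- "$scratch"
```

Previewing with echo like this before running mv or rm on a pattern is a cheap safety check.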


19.8 Forgetting that Pipelines Make Subshells 


Problem 


You have a script that works just fine, reading input in a while loop: 


# This works as expected 
COUNT=0 
while read ALINE 
do 
let COUNT++ 
done 
echo $COUNT 


And then you change it like this, to read from a file, with the name of that file 
specified as the first parameter to the script: 


# Don't use; this does NOT work as expected! 
COUNT=0 
cat $1 | while read ALINE 
do 
let COUNT++ 
done 
echo $COUNT # $COUNT is always '0', which is useless 


But now it no longer works; $COUNT keeps coming out as zero. 




Solution 


Pipelines create subshells. Changes in the while loop do not affect the 
variables in the outer part of the script, because this while loop, as with each 
command of a pipeline, is run in a subshell. (The cat command is run in a 
subshell, too, but it doesn’t alter shell variables.) 


One solution: don’t do that (if you can help it). That is, don’t use a pipeline. 

In this example, there was no need to use cat to pipe the file’s content into the 
while statement—you could use I/O redirection rather than setting up a 
pipeline: 


# Avoid the | and subshell; use "done < $1" instead 
# It now works as expected 
COUNT=0 
while read ALINE 
do 
let COUNT++ 
done < $1 # <<<< This is the line with the key difference 
echo $COUNT 


Such an easy rearrangement might not work for your problem, however, in 
which case you’ll have to use another technique. 


As of version 4.2 of bash, you can prevent this problem in a script simply by 
setting the shell option lastpipe early on in the script: 


shopt -s lastpipe 


If that still doesn’t work or you’re using a version of bash older than 4.2, see 
the discussion. 


Discussion 


If you add an echo statement inside the while loop of the example script, you 
can see $COUNT increasing, but once you exit the loop, $COUNT will be back to 
zero. The way that bash sets up the pipeline of commands means that each 
command in the pipeline will execute in its own subshell. So the while loop 
is in a subshell, not in the main shell. The while loop will begin with the 
same value that the main shell script was using for $COUNT, but since the 
while loop is executing in a subshell there is no way to get the value back up 
to the parent shell. 


One approach to deal with this is to take all the additional work and make it 
part of the same subshell that includes the while loop. For example: 


COUNT=0 
cat $1 | { while read ALINE 
do 
let COUNT++ 
done 
echo $COUNT ; } # Spaces are important here 


The placement of the braces is crucial here. What we’ve done is explicitly 
delineate a section of the script to be run together in the same (sub)shell. It 
includes both the while loop and the other work that we want to do after the 
while loop completes (here all we’re doing is echoing $COUNT). Since the 
while and echo statements are not connected via a pipeline, they will both 
run in the same subshell delineated by the braces. The $COUNT that was 
accumulated during the while loop will remain until the end of the subshell 
—that is, until the close brace is reached. 


If you use this technique it might be good to format the statements a bit 
differently, to make the use of the bracketed subshell stand out more. Here’s 
the example script reformatted: 


COUNT=0 
cat $1 | 
{ 
    while read ALINE 
    do 
        let COUNT++ 
    done 
    echo $COUNT 
} 


This issue can be avoided altogether if you are using version 4.2 or later of 
bash. In your script, simply set the shell option lastpipe (some sysadmins might 
even want to set this in their /etc/profile or a related .rc file so no one else 
needs to set it): 


shopt -s lastpipe 


This option tells the shell to run the last command of a pipeline in the current 
shell, rather than a subshell, thereby making its variables available to the rest 
of the shell script that comes after the pipeline. 


Here is an example similar to the previous example, though it uses ls rather 
than cat as the source of its data: 


shopt -s lastpipe # as of ver. 4.2 bash 
COUNT=0 
ls | while read ALINE 
do 
let COUNT++ 
done 
echo $COUNT 


Try it with and without the shopt statement and you can see the effect. 


WARNING 
The lastpipe behavior only works if job control is disabled, which is the 
default condition for noninteractive shells (i.e., bash scripts). If you want to use 
lastpipe interactively, then you need to disable job control with set +m, but 
in doing so you lose the ability to interrupt (^C) or to suspend (^Z) a running 
command, and you cannot use the fg and bg commands. We recommend against 
doing so. 
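

One more option, not covered in the recipe itself, is bash's process substitution, which feeds a command's output to the loop through a redirection so that the loop stays in the current shell. The sketch below uses printf as a stand-in data source and plain arithmetic expansion instead of let; treat it as an illustration rather than the recipe's own method.

```shell
COUNT=0
while read -r ALINE
do
    COUNT=$((COUNT + 1))
done < <(printf '%s\n' one two three)   # the producer runs in a subshell; the loop does not
echo "$COUNT"                           # prints 3
```

Because no pipeline is involved, this does not depend on lastpipe or on job control being disabled.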


See Also 
= Question E4 in the bash FAQ, version 4.14 
= Recipe 10.5, “Using Functions: Parameters and Return Values” 


= Recipe 19.5, “Expecting to Change Exported Variables” 


19.9 Making Your Terminal Sane Again 


Problem 


You have aborted an SSH session and now you can’t see what you are typing. 
Or perhaps you accidentally displayed a binary file and your terminal 
window is now full of gibberish. 


Solution 


Type stty sane and then press the Enter key, even if you can’t see what you 
are typing, to restore sane terminal settings. You may want to hit Enter a few 
times first, to make sure you don’t have anything else on your input line 
before you start typing the stty command. 


If you do this a lot, you might consider creating an alias that’s easier to type 
blind (see Recipe 10.7). 


Discussion 


Aborting some older versions of ssh at a password prompt may leave 
terminal echo (the displaying of characters as you type them, not the shell 
echo command) turned off so you can’t see what you are typing. Depending 
on what kind of terminal emulation you are using, displaying a binary file can 
also accidentally change terminal settings. In either case, stty’s sane setting 
attempts to return all terminal settings to their default values. This includes 
restoring echo capability, so that what you type on the keyboard appears in 
your terminal window. It will also likely undo whatever strangeness has 
occurred with other terminal settings. 


Your terminal application may have some kind of reset function too, so 
explore the menu options and documentation. You may also want to try the 
reset and tset commands, though in our testing stty sane worked as desired 
while reset and tset were more drastic in what they fixed. 


See Also 


= man reset 
= man stty 
= man tset 


= Recipe 10.7, “Redefining Commands with alias” 


19.10 Deleting Files Using an Empty Variable 


Problem 


You have a variable that you think contains a list of files to delete, perhaps to 
clean up after your script. But in fact, the variable is empty and Bad Things 
happen. 


Solution 
Never do: 


rm -rf $files_to_delete 


Never, ever, ever do: 


rm -rf /$files_to_delete 


Use this instead: 


[ -n "$files_to_delete" ] && rm -rf $files_to_delete 


Discussion 


The first example isn’t too bad; it’ll just throw an error. The second one is 
pretty bad because it will try to delete your root directory. If you are running 
as a regular user (and you should be; see Recipe 14.18), it may not be too 
bad, but if you are running as root then you’ve just killed your system but 
good. (Yes, we’ve done this.) 


The solution is easy. First, make sure that there is some value in the variable 
you’re using, and second, never precede that variable with a /. 
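

The same check can be wrapped in a function so that every cleanup goes through it. This is a sketch under assumptions: cleanup_list is a hypothetical helper name, and the scratch files exist only for the demonstration.

```shell
# cleanup_list: refuse to run rm at all when the list is empty.
cleanup_list() {
    local files_to_delete=$1
    if [ -z "$files_to_delete" ]; then
        echo "cleanup_list: nothing to delete, refusing to run rm" >&2
        return 1
    fi
    # Unquoted on purpose: the variable holds a whitespace-separated list.
    rm -rf -- $files_to_delete
}

scratch=$(mktemp -d)
touch "$scratch/a.tmp" "$scratch/b.tmp"
cleanup_list "$scratch/a.tmp $scratch/b.tmp"
cleanup_list "" || echo "empty list was rejected"
rm -rf -- "$scratch"
```

Note that no leading / is ever prepended to the variable, in keeping with the second rule above.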


See Also 


= Recipe 14.18, “Running as a Non-root User” 


= Recipe 18.7, “Playing It Safe” 


19.11 Seeing Odd Behavior from printf 


Problem 


Your script is giving you values that don’t match what you expected. 
Consider this simple script and its output: 


$ bash oddscript 
good nodes: 0 
bad nodes: 6 
miss nodes: 0 
GOOD=6 BAD=0 MISS=0 

$ cat oddscript 
#!/bin/bash - 
badnode=6 

printf "good nodes: %d\n" $goodnode 
printf "bad nodes: %d\n" $badnode 
printf "miss nodes: %d\n" $missnode 
printf "GOOD=%d BAD=%d MISS=%d\n" $goodnode $badnode $missnode 


Why is 6 showing up as the value for the good count, when it is supposed to 
be the value for the bad count? 


Solution 


Either give the variables an initial value (e.g., 0) or put quotes around the 
references to them on printf lines. 


Discussion 


What’s happening here? bash does its substitutions on that last line, and when 
it evaluates $goodnode and $missnode they both come out null, empty, not 
there. So the line that is handed off to printf to execute looks like this: 


printf "GOOD=%d BAD=%d MISS=%d\n" 6 


When printf tries to print the three decimal values (the three %d formats), it 
has a value (i.e., 6) for the first one but doesn’t have anything for the next 
two, so they come out as 0 and you get: 


GOOD=6 BAD=0 MISS=0 


You can’t really blame printf, since it never saw the other arguments; bash 
had done its parameter substitution before printf ever got to run. 


Even declaring them as integer values, like this: 
declare -i goodnode badnode missnode 


isn’t enough. You need to actually assign them a value. 


The other way to avoid this problem is to quote the arguments when they are 
used in the printf statement, like this: 


printf "GOOD=%d BAD=%d MISS=%d\n" "$goodnode" "$badnode" "$missnode" 


Then the first argument won’t disappear, but an empty string will be put in its 
place, so that what printf gets is the three needed arguments: 


printf "GOOD=%d BAD=%d MISS=%d\n" "" "6" "" 
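

Both fixes can be combined by quoting each reference and supplying a default at the point of use. The ${name:-0} form is standard parameter expansion; the summary variable is introduced here only so the result can be inspected.

```shell
unset goodnode missnode    # deliberately left unset, as in oddscript
badnode=6

# ${name:-0} expands to 0 when the variable is unset or empty, and the
# quotes guarantee that printf always receives exactly three arguments.
summary=$(printf "GOOD=%d BAD=%d MISS=%d" \
    "${goodnode:-0}" "${badnode:-0}" "${missnode:-0}")
echo "$summary"            # GOOD=0 BAD=6 MISS=0
```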


While we’re on the subject of printf, it has one other odd behavior. We have 
just seen how it behaves when there are too few arguments; when there are 
too many arguments, printf will keep repeating and reusing the format line 
and it will look like you are getting multiple lines of output when you 
expected only one. 


Of course, this can be put to good use, as in the following case: 


$ dirs 

/usr/bin /tmp ~/scratch/misc 
$ printf "%s\n" $(dirs) 
/usr/bin 

/tmp 

~/scratch/misc 


$ 


Here, printf takes the directory stack (i.e., the output from the dirs command) 
and displays the directories one per line, repeating and reusing the format, as 
described earlier. 


Let’s summarize the best practices: 


= Initialize your variables, especially if they are numbers and you want to 
use them in printf statements. 


= Put quotes around your arguments if they could ever be null, and 
especially when used in printf statements. 


= Make sure you have the correct number of arguments, especially 
considering what the line will look like after the shell substitutions have 
occurred. 


= The safest way to display an arbitrary string is to use printf '%s\n' 
"$string". 




See Also 

= http://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html 
= Recipe 2.3, “Writing Output with More Formatting Control” 

= Recipe 2.4, “Writing Output Without the Newline” 

= Recipe 15.6, “Using echo Portably” 

= “printf” in Appendix A 


19.12 Testing bash Script Syntax 


Problem 


You are editing a bash script and want to make sure that your syntax is 
correct. 


Solution 


Use the -n argument to bash to test syntax often, ideally after every save, and 
certainly before committing any changes to a revision control system: 


$ bash -n my_script 
$ echo 'echo "Broken line' >> my_script 
$ bash -n my_script 
my_script: line 4: unexpected EOF while looking for matching `"' 
my_script: line 5: syntax error: unexpected end of file 
$ 


Discussion 


The -n option is tricky to find in the bash manpage or other reference 
material since it’s located under the set builtin. It is noted in passing in bash 
--help for -D, but it is never explained there. This flag tells bash to “read 
commands but do not execute them,” which of course will find bash syntax 
errors. 


As with all syntax checkers, this will not catch logic errors or syntax errors in 
other commands called by the script. 
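

To make the check routine, for example before a commit, it can be wrapped in a small function. This is a sketch: check_syntax is a hypothetical helper name, and the two scratch scripts are created only to exercise it.

```shell
# check_syntax: run bash -n on each file; return nonzero if any file fails.
check_syntax() {
    local script status=0
    for script in "$@"
    do
        if bash -n "$script" 2>/dev/null
        then
            echo "ok:     $script"
        else
            echo "broken: $script"
            status=1
        fi
    done
    return "$status"
}

scratch=$(mktemp -d)
printf '%s\n' 'echo "fine"' > "$scratch/good.sh"
printf '%s\n' 'echo "Broken line' > "$scratch/bad.sh"
check_syntax "$scratch/good.sh" "$scratch/bad.sh" || echo "at least one script failed"
rm -rf -- "$scratch"
```

Drop the 2>/dev/null if you want bash's own error messages alongside the summary.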


See Also 

= man bash 

= bash --help 

= bash -c "help set" 

= Recipe 16.1, “bash Startup Options” 


19.13 Debugging Scripts 


Problem 


You can’t figure out what’s happening in your script and why it doesn’t work 
as expected. 


Solution 


Add set -x to the top of the script when you run it, or use set -x to turn on 
xtrace before a troublesome spot and set +x to turn it off after. You may 
also wish to experiment with the $PS4 prompt (Recipe 16.2). xtrace also 
works on the interactive command line. Example 19-1 is a script that we 
suspect is buggy. 


Example 19-1. ch19/buggy 
#!/usr/bin/env bash 

# cookbook filename: buggy 
# 


set -x 


result=$1 

[ $result = 1 ] \ 
  && { echo "Result is 1; excellent." ; exit 0;   } \ 
  || { echo "Uh-oh, ummm, RUN AWAY! " ; exit 120; } 


Now we invoke this script, but first we set and export the value of the $PS4 
prompt. bash will print out the value of $PS4 before each command that it 
displays during an execution trace (i.e., after a set -x): 


$ export PS4='+xtrace $LINENO: ' 


$ echo $PS4 
+xtrace $LINENO: 


$ ./buggy 
+xtrace 4: result= 
+xtrace 6: '[' = 1 ']' 
./buggy: line 6: [: =: unary operator expected 
+xtrace 8: echo 'Uh-oh, ummm, RUN AWAY! ' 
Uh-oh, ummm, RUN AWAY! 

$ ./buggy 1 
+xtrace 4: result=1 
+xtrace 6: '[' 1 = 1 ']' 
+xtrace 7: echo 'Result is 1; excellent.' 
Result is 1; excellent. 

$ ./buggy 2 
+xtrace 4: result=2 
+xtrace 6: '[' 2 = 1 ']' 
+xtrace 8: echo 'Uh-oh, ummm, RUN AWAY! ' 
Uh-oh, ummm, RUN AWAY! 

$ /tmp/jp-test.sh 3 
+xtrace 4: result=3 
+xtrace 6: '[' 3 = 1 ']' 
+xtrace 8: echo 'Uh-oh, ummm, RUN AWAY! ' 
Uh-oh, ummm, RUN AWAY! 


Discussion 


It may seem odd to turn something on using - and turn it off using +, but 
that’s just the way it worked out. Many Unix tools use -n for options or flags, 
and since you need a way to turn -x off, +x seems natural. 
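

Tracing only the suspect region keeps the output short. The sketch below wraps the comparison from the buggy script in a hypothetical function, check_result, and quotes "$result" so that an empty argument no longer triggers the "unary operator expected" error seen in the trace.

```shell
PS4='+xtrace $LINENO: '

# check_result: trace just the comparison, then switch tracing back off.
check_result() {
    local result=$1 rc=0
    set -x                          # tracing on
    [ "$result" = 1 ] || rc=$?      # quoted: an empty value is still one argument
    { set +x; } 2>/dev/null         # tracing off, without logging the set +x itself
    return "$rc"
}

check_result 1  && echo "Result is 1; excellent."
check_result "" || echo "Uh-oh, ummm, RUN AWAY!"
```

The trace lines go to standard error, so they can be separated from normal output with 2> if needed.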


As of bash 3.0 there are a number of new variables to better support 
debugging: $BASH_ARGC, $BASH_ARGV, $BASH_SOURCE, $BASH_LINENO, 
$BASH_SUBSHELL, $BASH_EXECUTION_STRING, and $BASH_COMMAND. There is 
also a new extdebug shell option. These are in addition to existing bash 
variables like $LINENO and the array variable $FUNCNAME. 


From the Bash Reference Manual: 


If [extdebug is] set at shell invocation, arrange to execute the debugger 
profile before the shell starts, identical to the --debugger option. If set 
after invocation, behavior intended for use by debuggers is enabled: 


= The -F option to the declare builtin...displays the source file name and 
line number corresponding to each function name supplied as an 
argument. 


= If the command run by the DEBUG trap returns a nonzero value, the next 
command is skipped and not executed. 


= If the command run by the DEBUG trap returns a value of 2, and the shell 
is executing in a subroutine (a shell function or a shell script executed 
by the . or source builtins), the shell simulates a call to return. 


= BASH_ARGC and BASH_ARGV are updated... 


= Function tracing is enabled: command substitution, shell functions, and 
subshells invoked with ( command ) inherit the DEBUG and RETURN 
traps. 


= Error tracing is enabled: command substitution, shell functions, and 
subshells invoked with ( command ) inherit the ERR trap. 


Using xtrace is a very handy debugging technique, but it is not the same as 
having a real debugger. For that, see the Bash Debugger Project, which 
contains patched sources to bash that enable better debugging support as well 
as improved error reporting. In addition, this project contains, in the 
developer’s words, “the most comprehensive source-code debugger for 
BASH that has been written.” 


See Also 
= help set 


= man bash 


= Chapter 9 in Cameron Newham’s Learning the bash Shell, 3rd Edition 
(O’Reilly), which includes a shell script for debugging other shell scripts 


= Recipe 16.1, “bash Startup Options” 
= Recipe 16.2, “Customizing Your Prompt” 
= Recipe 17.1, “Renaming Many Files” 


= https://www.gnu.org/software/bash/manual/html_node/The-Shopt-Builtin.html 


= http://bashdb.sourceforge.net/ 


19.14 Avoiding “command not found” When 
Using Functions 


Problem 


You are used to other languages, such as Perl, which allow you to call a 
function in a section of your code that comes before the actual function 
definition. 


Solution 


Shell scripts are read and executed in a top-to-bottom linear way, so you must 
define any functions before you use them. 


Discussion 


Some other languages, such as Perl, go through intermediate steps during 
which the entire script is parsed as a unit. That allows you to write your code 
so that main() is at the top, and functions (or subroutines) are defined later. 
By contrast, a shell script is read into memory and then executed one line at a 
time, so you can’t use a function before you define it. 
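

If you still prefer to read the main logic first, a common workaround is to wrap it in a function and call that function on the last line. The names main and greet below are illustrative, not from the recipe.

```shell
#!/usr/bin/env bash

main() {
    greet "$@"      # fine: greet is looked up when main runs, not when main is defined
}

greet() {
    echo "Hello, ${1:-world}"
}

main "$@"           # must come after every function that main calls
```

By the time the last line executes, bash has already read and defined both functions, so the order of the definitions among themselves no longer matters.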


See Also 
= Recipe 10.4, “Defining Functions” 
= Recipe 10.5, “Using Functions: Parameters and Return Values” 


= Appendix C 


19.15 Confusing Shell Wildcards and Regular 
Expressions 


Problem 


Sometimes you see .*, sometimes just *, and sometimes you see [a-z]* but 
it means something other than what you thought. You use regular expressions 
for grep and sed but not in some places in bash. You can’t keep it all straight. 


Solution 


Relax; take a deep breath. You’re probably confused because you’re learning 
so much (or just using it too infrequently to remember it). Practice makes 
perfect, so keep trying. 


The rules aren’t that hard to remember for bash itself. After all, regular 
expression syntax is only used with the =~ comparison operator in bash. All 
of the other expressions in bash use shell pattern matching. 


Discussion 


The pattern matching used by bash uses some of the same symbols as regular 
expressions, but with different meanings. But it is also the case that you often 
have calls in your shell scripts to commands that use regular expressions— 
commands like grep and sed. 


We asked Chet Ramey, the current keeper of the bash source and all-around 
bash guru, if it was really the case that the =~ was the only use of regular 
expressions in bash. He said yes. He also was kind enough to supply a list of 
the various parts of bash syntax that use shell pattern matching. We’ve 
covered most, but not all of these topics in various recipes in this book. We 
offer the list here for completeness. 


Shell pattern matching is performed by: 

= Filename globbing (pathname expansion) 
= == and != operators for [[ 
= case statements 
= $GLOBIGNORE handling 
= $HISTIGNORE handling 
= ${parameter#[#]word} 
= ${parameter%[%]word} 
= ${parameter/pattern/string} 
= Several bindable readline commands (glob-expand-word, glob-complete-word, etc.) 
= complete -G and compgen -G 
= complete -X and compgen -X 
= The help builtin’s pattern argument 


Thanks, Chet! 
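

The difference is easiest to see when the same string is tested with both kinds of match inside [[ ]]. In this sketch the variable names are invented for illustration.

```shell
name="abc123"

# == uses shell pattern matching: * means "any string".
[[ $name == a* ]] && glob_hit=yes || glob_hit=no

# =~ uses a regular expression: .* means "any characters", and the match
# is unanchored unless you add ^ and $.
[[ $name =~ ^a.*[0-9]+$ ]] && regex_hit=yes || regex_hit=no

# Read as a shell pattern, "a.*" means a literal dot followed by anything,
# so it does not match a string with no dot in it.
[[ $name == a.* ]] && dot_glob=yes || dot_glob=no

echo "glob a*: $glob_hit, regex: $regex_hit, glob a.*: $dot_glob"
```

The first two tests succeed and the third fails, which is the confusion between .* and * described in the Problem.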


NOTE 


Learn to read the manpage for bash and refer to it often—it is long but precise. 
If you want an online version of the bash manpage or other bash-related 
documents, visit http://www.bashcookbook.com for the latest bash information. 
Keep this book handy for reference, too. 


See Also 


= man bash 
= Recipe 5.18, “Changing Pieces of a String” 
= Recipe 6.6, “Testing for Equality” 


= Recipe 6.7, “Testing with Pattern Matches” 


= Recipe 6.8, “Testing with Regular Expressions” 


= Recipe 13.15, “Trimming Whitespace” 


Appendix A. Reference Lists 


This appendix collects many tables of values, settings, operators, commands, 
variables, and more in one place for easy reference. 


bash Invocation 


Here are the options you can use when invoking current versions of bash. 
The multi-character options must appear on the command line before the 
single-character options. Login shells usually have the options -i 
(interactive), -s (read from standard input), and -m (enable job control) set 
internally. 


In addition to those listed in Table A-1, any set option can be used on the 
command line; see “set Options”. In particular, the -n option is invaluable for 
syntax checking (see Recipe 19.12), and -x is used for debugging (Recipe 
19.13). 


For further reference, see http://bit.ly/2wtEjA8. 


Table A-1. Command-line options to bash 
Option                 Meaning 

-c string              Commands are read from string, if present. Any arguments after 
                       string are interpreted as positional parameters, starting with $0. 

-D                     A list of all double-quoted strings preceded by $ is written to 
                       standard output. These are the strings that are subject to 
                       language translation when the current locale is not C or POSIX. 
                       This also turns on the -n option. 

-i                     Makes the shell an interactive shell. Ignores signals TERM, INT, 
                       and QUIT. With job control in effect, TTIN, TTOU, and TSTP are 
                       also ignored. 

-l                     Makes bash act as if it were invoked as a login shell. 

-o option              Takes the same arguments as set -o (see “set Options”). 

-O, +O shopt-option    shopt-option is one of the shell options accepted by the shopt 
                       builtin. If shopt-option is present, -O sets the value of that 
                       option; +O unsets it. If shopt-option is not supplied, the names 
                       and values of the shell options accepted by shopt are written to 
                       standard output. If the invocation option is +O, the output is 
                       displayed in a format that may be reused as input. 

-s                     Reads commands from standard input. If an argument is given to 
                       bash, this flag takes precedence (i.e., the argument won’t be 
                       treated as a script name and standard input will be read). 

-r                     Makes the shell a restricted shell. 

-v                     Prints shell input lines as they’re read. 

-                      Signals the end of options and disables further option 
                       processing. Any options after this are treated as filenames and 
                       arguments. -- is synonymous with -. 

--debugger             Arranges for the debugger profile to be executed before the shell 
                       starts. Turns on extended debugging mode and shell function 
                       tracing in bash 3.0 or later. 

--dump-strings         Does the same as -D. 

--dump-po-strings      Does the same as -D, but the output is in the GNU gettext 
                       portable object (.po) file format. 

--help                 Displays a usage message and exits. 

--login                Makes bash act as if it were invoked as a login shell. Same as -l. 

--noediting            Does not use the GNU readline library to read command lines if 
                       the shell is interactive. 

--noprofile            Does not read the startup file /etc/profile or any of the 
                       personal initialization files. 

--norc                 Does not read the initialization file ~/.bashrc if the shell is 
                       interactive. This is on by default if the shell is invoked as sh. 

--posix                Changes the behavior of bash to follow the POSIX standard more 
                       closely where the default operation of bash is different. 

--rcfile file,         Executes commands read from file instead of the initialization 
--init-file file       file ~/.bashrc, if the shell is interactive. 

--restricted           Equivalent to -r. 

--verbose              Equivalent to -v. 

--version              Shows the version number of this instance of bash and then exits. 


Prompt String Customizations 


Table A-2 shows a summary of the prompt customizations that are available. 
The customizations \[ and \] are not available in bash versions prior to 1.14. 
\a, \e, \H, \T, \@, \v, and \V are not available in versions prior to 2.0. \A, \D, 
\j, \l, and \r are only available in later versions of bash 2.0 and in bash 
3.0+. 


See http://bit.ly/2wlpQHf. 


Table A-2. Prompt string format codes 


Command Meaning 


\a The ASCII bell character (007). 

\A The current time in 24-hour HH:MM format. 

\d The date in “Weekday Month Day” format. 

\D{format} The format is passed to strftime(3) and the result is inserted into the 
prompt string; an empty format results in a locale-specific time representation. 
The braces are required. 


\e The ASCII escape character (033). 

\H The hostname. 

\h The hostname up to the first “.”. 

\j The number of jobs currently managed by the shell. 

\l The basename of the shell’s terminal device name. 

\n A carriage return and line feed. 

\r A carriage return. 

\s The name of the shell. 

\T The current time in 12-hour HH:MM:SS format. 

\t The current time in 24-hour HH:MM:SS format. 

\@ The current time in 12-hour a.m./p.m. format. 

\u The username of the current user. 

\v The version of bash (e.g., 2.00). 

\V The release of bash (the version and patch level; e.g., 3.00.0). 
\w The current working directory, with $HOME abbreviated with a tilde (uses 
the $PROMPT_DIRTRIM variable). 


\W The basename of $PWD, with $HOME abbreviated with a tilde. 

\# The command number of the current command. 

\! The history number of the current command. 

\$ If the effective UID is 0, a #; otherwise, a $. 

\nnn The character code in octal. 

\\ A backslash. 

\[ Begin a sequence of nonprinting characters, such as a terminal control 
sequence. 

\] End a sequence of nonprinting characters. 


ANSI Color Escape Sequences 
Table A-3 shows the ANSI color escape sequences. 


Table A-3. ANSI color escape sequences 


Code  Character attribute   FG code  Foreground color  BG code  Background color
0     Reset all attributes  30       Black             40       Black
1     Bright                31       Red               41       Red
2     Dim                   32       Green             42       Green
4     Underscore            33       Yellow            43       Yellow
5     Blink                 34       Blue              44       Blue
7     Reverse               35       Magenta           45       Magenta
8     Hidden                36       Cyan              46       Cyan
                            37       White             47       White
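
The table lists only the numeric codes. As a minimal sketch, assuming a terminal that honors ANSI sequences of the form ESC [ codes m (with multiple codes separated by semicolons), the codes can be used as follows. The variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Build escape sequences from the codes in Table A-3: ESC [ <codes> m
bright_red=$'\e[1;31m'   # 1 = bright, 31 = red foreground
reset=$'\e[0m'           # 0 = reset all attributes

printf '%sError:%s something went wrong\n' "$bright_red" "$reset"

# In a prompt string, wrap nonprinting sequences in \[ and \] so that bash
# can work out the visible width of the prompt.
colored_ps1='\[\e[32m\]\u@\h\[\e[0m\]:\w\$ '
```

Always emit the reset sequence after colored text; otherwise the attributes stay in effect for whatever is printed next.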


Builtin Commands 


Table A-4 lists the builtin commands in current versions of bash (see 
http://bit.ly/2wlut4o). 


Table A-4. Builtin commands 
Command Summary

. Read a file and execute its contents in the current shell. See source.

: Do nothing (just do expansions of any arguments).

[ Evaluate a conditional expression. See test.


alias Set up shorthand for a command or command line. 

bg Put a job in the background. 

bind Bind a key sequence to a readline function or macro. 

break Exit from the surrounding for, select, while, or until loop. 


builtin Execute the specified shell builtin. 


caller Return the context of any active subroutine call (a shell function or a 
script executed with the . or source builtins). 


cd Change the working directory. 

command Run a command, bypassing shell function lookup. 
compgen Generate possible completion matches. 

complete Specify how completion should be performed. 


compopt Modify completion options for each name according to the options, or for 
the currently executing completion if no names are supplied. 


continue Skip to the next iteration of the surrounding for, select, while, or until 
loop. 


declare Declare variables and give them attributes. Same as typeset. 
dirs Display the list of currently remembered directories. 
disown Remove a job from the job table. 

echo Output arguments. 


enable Enable and disable (with -n) builtin shell commands. 


eval Run the given arguments through command-line processing. 
exec Replace the shell with the given program. 
exit Exit from the shell. 
export Create environment variables.

fc Fix command (edit the history file).

fg Bring a background job into the foreground.

getopts Process command-line options.

hash Remember full pathnames of the specified commands.

help Display helpful information on builtin commands.

history Display the command history.

jobs List any background jobs.

kill Send a signal to a process.

let Arithmetic variable assignment.

local Create a local variable.

logout Exit a login shell.

mapfile Read lines from standard input or a file descriptor into the indexed array
variable array. See readarray.

popd Remove a directory from the directory stack.

printf Write the formatted arguments to standard output under the control of the
format.

pushd Add a directory to the directory stack.

pwd Print the working directory.

read Read a line from standard input.

readarray Read lines from standard input or a file descriptor into the indexed array
variable array. See mapfile.

readonly Make variables read-only (unassignable).

return Return from the surrounding function or script.
set Set options.

shift Shift command-line arguments.

shopt Toggle the values of settings controlling optional shell behavior.

source Read a file and execute its contents in the current shell. See . (dot).

suspend Suspend execution of a shell.

test Evaluate a conditional expression. See [.

times Print the accumulated user and system times for processes run from the
shell.

trap Set up a signal-catching routine.

type Identify the source of a command.

typeset Declare variables and give them attributes. Same as declare.

ulimit Set/show process resource limits.

umask Set/show the file permission mask.

unalias Remove alias definitions.

unset Remove definitions of variables or functions.

wait Wait for background job(s) to finish.
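
Several of these builtins exist mainly to control which command actually runs. The sketch below, which uses a hypothetical wrapper function named cd, shows how type, builtin, and command interact with a function that shadows a builtin.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: a function named cd that shadows the builtin.
cd() {
    # 'builtin' bypasses the function and runs the real cd.
    builtin cd "$@" && printf 'now in %s\n' "$PWD"
}

type -t cd      # prints "function": the function shadows the builtin
type -t echo    # prints "builtin"
type -t if      # prints "keyword" (a reserved word, not a builtin)

# 'command' skips shell functions during lookup, so this runs the builtin cd
# directly and the wrapper's message is not printed.
command cd /
```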


bash Reserved Words 


Table A-5 lists the reserved words in current versions of bash (see 
http://bit.ly/2wlxEZO). 


Table A-5. bash reserved words 


Command Summary


! Logical NOT of a command’s exit status.

[[ ]] Return a status of 0 or 1 depending on the evaluation of the conditional
expression.

(( )) Evaluate the arithmetic expression according to the bash shell arithmetic
rules.

( ) Execute the list in a subshell.

{ } Execute the list in the current shell context.

case Multiway conditional construct. 

do Part of a for, select, while, or until looping construct. 

done Part of a for, select, while, or until looping construct. 

elif Part of an if construct. 

else Part of an if construct. 

esac End of a case construct. 

fi End of an if construct.

for Looping construct. 


function Define a function. 

if Conditional construct. 

in Part of a case construct. 
select Menu-generation construct.
then Part of an if construct. 


time Run the command pipeline and print execution times. The format of the 
output can be controlled with TIMEFORMAT. 


until Looping construct. 


while Looping construct. 
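
The difference between ( ) and { } is easy to miss in a summary table. As a small sketch with made-up variable names, the following shows that an assignment made in a subshell is lost, while one made in a brace group persists. It also uses !, (( )), and [[ ]].

```shell
#!/usr/bin/env bash
count=0

( count=1 )              # ( ) runs the list in a subshell; the change is lost
after_subshell=$count    # still 0

{ count=2; }             # { } runs the list in the current shell
after_group=$count       # now 2

# (( expr )) returns nonzero when expr evaluates to 0, so ! inverts it to success.
if ! (( after_subshell )) && [[ $after_group -eq 2 ]]; then
    echo "subshell change was discarded; group change persisted"
fi
```

Note the space after { and the semicolon before }, both of which are required when the group is written on one line.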
Builtin Shell Variables


Table A-6 shows a complete list of environment variables available in bash 
4.4. The letters in the Type column of the table have the following meanings: 
A = array, L = colon-separated list, R = read-only, U = unsetting it causes it 
to lose its special meaning. 


Note that the variables beginning with BASH_ or COMP, as well as the variables
DIRSTACK, FUNCNAME, GLOBIGNORE, GROUPS, HISTIGNORE, HOSTNAME,
HISTTIMEFORMAT, LANG, LC_ALL, LC_COLLATE, LC_MESSAGES, MACHTYPE,
PIPESTATUS, SHELLOPTS, and TIMEFORMAT are not available in versions prior
to 2.0. BASH_ENV replaces ENV, found in earlier versions.


See http://bit.ly/2v2Xxcr.


Table A-6. Builtin shell environment variables 
Variable Type Description


* R The positional parameters given to the current script
or function. If not double-quoted, each word is further
split and expanded. If double-quoted, this returns a
single string containing each argument separated by
the first character of $IFS (e.g., "arg1 arg2 arg3") or
with no separator if $IFS is null.

@ R Each of the positional parameters given to the current
script or function, given as a list of double-quoted
strings (e.g., "arg1" "arg2" "arg3").

# R The number of arguments given to the current script
or function.

? R The exit status of the previous command.

- R The options given to the shell on invocation.

$ R The process ID of the shell process.

! R The process ID of the last background command.

0 R The name of the shell or shell script.

_ R The last argument to the previous command.

auto_resume Controls how job control works (values are exact,
substring, or something other than those keywords).

BASH The full pathname used to invoke this instance of
bash.

BASHOPTS A colon-separated list of enabled shell options.

BASHPID Expands to the process ID of the current bash process.

BASH_ALIASES An associative array variable whose members
correspond to the internal list of aliases as maintained
by the alias builtin.

BASH_ARGC An array of values, which are the number of
parameters in each frame of the current bash
execution call stack. The number of parameters to the
current subroutine (shell function or script executed
with . or source) is at the top of the stack.

BASH_ARGV All of the parameters in the current bash execution
call stack. The final parameter of the last subroutine
call is at the top of the stack; the first parameter of the
initial call is at the bottom.

BASH_CMDS An associative array variable whose members
correspond to the internal hash table of commands as
maintained by the hash builtin.

BASH_COMMAND The command currently being executed or about to be
executed, unless the shell is executing a command as
the result of a trap, in which case it is the command
executing at the time of the trap.

BASH_ENV The name of a file to run as the environment file when
the shell is invoked.

BASH_EXECUTION_STRING The command argument to the -c invocation option.
BASH_LINENO An array whose members are the line numbers in
source files corresponding to each member of
${FUNCNAME}. ${BASH_LINENO[$i]} is the line number
in the source file where ${FUNCNAME[$i]} was
called. The corresponding source filename is
${BASH_SOURCE[$i + 1]}.

BASH_LOADABLES_PATH L A colon-separated list of directories in which the shell
looks for dynamically loadable builtins specified by
the enable command.

BASH_REMATCH AR An array whose members are assigned by the =~
binary operator to the [[ conditional command. The
element with index 0 is the portion of the string
matching the entire regular expression. The element
with index n is the portion of the string matching the
nth parenthesized subexpression.

BASH_SOURCE An array containing the source filenames
corresponding to the elements in the $FUNCNAME array
variable.

BASH_SUBSHELL Incremented by 1 each time a subshell or subshell
environment is spawned. The initial value is 0. A
subshell is a forked copy of the parent shell and shares
its environment.

BASH_VERSINFO AR Version information for this instance of bash. Each
element of the array holds parts of the version
number.

BASH_VERSION The version number of this instance of bash.

BASH_XTRACEFD If set to an integer corresponding to a valid file
descriptor, bash will write the trace output generated
when set -x is enabled to that file descriptor.

CDPATH A list of directories for the cd command to search.

CHILD_MAX The number of exited child status values for the shell
to remember.
COLUMNS Used by the select command to determine the terminal
width when printing selection lists.

COMP_CWORD An index into ${COMP_WORDS} of the word containing the
current cursor position. This variable is available only
in shell functions invoked by the programmable
completion facilities.

COMP_LINE The current command line. This variable is available
only in shell functions and external commands
invoked by the programmable completion facilities.

COMP_POINT The index of the current cursor position relative to the
beginning of the current command. If the current
cursor position is at the end of the current command,
the value of this variable is equal to ${#COMP_LINE}.
This variable is available only in shell functions and
external commands invoked by the programmable
completion facilities.

COMP_WORDS An array of the individual words in the current
command line. This variable is available only in shell
functions invoked by the programmable completion
facilities.

COMPREPLY An array variable from which bash reads the possible
completions generated by a shell function invoked by
the programmable completion facilities.

COMP_KEY The key (or final key of a key sequence) used to
invoke the current completion function.

COMP_TYPE An integer value corresponding to the type of
completion attempted that caused a completion
function to be called.

COPROC An array variable created to hold the file descriptors
for output from and input to an unnamed coprocess.
DIRSTACK ARU The current contents of the directory stack.

echo-control-characters When set to on, on operating systems that indicate
they support it, readline echoes a character
corresponding to a signal generated from the
keyboard.

editing-mode Controls which default set of key bindings is used.

EMACS If bash finds this variable in the environment when
the shell starts with the value t, it assumes that the
shell is running in an Emacs shell buffer and disables
line editing.

ENV Similar to BASH_ENV; used when the shell is invoked in
POSIX mode.

EUID The effective user ID of the current user.

EXECIGNORE A colon-separated list of shell patterns defining the
list of filenames to be ignored by command search
using PATH.

FCEDIT The default editor for the fc command.

FIGNORE A list of names to ignore when doing filename
completion.

FUNCNAME ARU An array containing the names of all shell functions
currently in the execution call stack. The element with
index 0 is the name of any currently executing shell
function. The bottom-most element is “main.” This
variable exists only when a shell function is
executing.

FUNCNEST If set to a numeric value greater than 0, defines a
maximum function nesting level.

GLOBIGNORE A list of patterns defining filenames to ignore during
pathname expansion.

GROUPS An array containing a list of groups of which the
current user is a member.

histchars Specifies what to use as the history control characters.
Normally set to the string !^#.

HISTCMD The history number of the current command.

HISTCONTROL A colon-separated list of values controlling how commands
are saved in the history list: ignorespace (lines
beginning with a space are not entered into the history
list), ignoredups (lines matching the last history line
are not saved in the history list), erasedups (all
previous lines matching the current line are removed
from the history list before the line is saved), or
ignoreboth (enables both ignorespace and
ignoredups).

HISTFILE The name of the command history file.

HISTFILESIZE The maximum number of lines to keep in the history
file.

HISTIGNORE A list of patterns used to determine what should be
retained in the history list.

HISTSIZE The maximum number of commands to keep in the
command history.

HISTTIMEFORMAT If set, timestamps are written to the history file so they
may be preserved across shell sessions. If not null,
this variable’s value is used as a format string for
strftime(3) to print the timestamp associated with each
history entry displayed by the history builtin.

HOME The home (login) directory.

HOSTFILE The file to be used for hostname completion.

HOSTNAME The name of the current host.

HOSTTYPE The type of machine bash is running on.
IFS The internal field separator: a list of characters that act
as word separators. Normally set to space, tab, and
newline.

IGNOREEOF The number of EOF characters that can be received
before exiting an interactive shell.

INPUTRC The readline startup file.

LANG Used to determine the locale category for any
category where one is not specifically set with a
variable starting with LC_.

LC_ALL Overrides the value of LANG and any other LC_ variable
specifying a locale category.

LC_COLLATE Determines the collation order used when sorting the
results of pathname expansion.

LC_CTYPE Determines the interpretation of characters and the
behavior of character classes within pathname
expansion and pattern matching.

LC_MESSAGES Determines the locale used to translate double-quoted
strings preceded by a $.

LC_NUMERIC Determines the locale category used for number
formatting.

LC_TIME Determines the locale category used for date and time
formatting.

LINENO The number of the line that just ran in a script or
function.

LINES Used by the select command to determine the column
length for printing selection lists.

MACHTYPE A string describing the system on which bash is
executing.

MAIL The name of the file to check for new mail.
MAILCHECK How often (in seconds) to check for new mail.

MAILPATH A list of filenames to check for new mail, if MAIL is
not set.

MAPFILE An array variable created to hold the text read by the
mapfile builtin when no variable name is supplied.

mark-directories If set to on, completed directory names have a slash
appended.

OLDPWD The previous working directory.

OPTARG The value of the last option argument processed by
getopts.

OPTERR If set to 1, display error messages from getopts.

OPTIND The index of the last option argument processed by
getopts.

OSTYPE The operating system on which bash is executing.

PATH The search path for commands.

PIPESTATUS An array variable containing a list of exit status values
from the processes in the most recently executed
foreground pipeline.

POSIXLY_CORRECT If this is set in the environment when bash starts, the
shell enters POSIX mode before reading the startup
files, as if the --posix invocation option had been
supplied. If it is set while the shell is running, bash
enables POSIX mode, as if the command set -o
posix had been executed.

PPID The process ID of the shell’s parent process.

PROMPT_COMMAND The value is executed as a command before the
primary prompt is issued.

PROMPT_DIRTRIM If set to a number greater than 0, the value is used as
the number of trailing directory components to retain
when expanding the \w and \W prompt string escapes.

PS0 If set, the prompt string is displayed in an interactive
shell after reading a command and before the
command is executed. Only available in bash 4.4 and
newer.

PS1 The primary command prompt string.

PS2 The prompt string for line continuations.

PS3 The prompt string for the select command.

PS4 The prompt string for the xtrace option.

PWD The current working directory as set by the cd builtin.

RANDOM When referenced, generates a pseudorandom integer
between 0 and 32,767.

READLINE_LINE The contents of the readline line buffer, for use with
bind -x.

READLINE_POINT The position of the insertion point in the readline line
buffer, for use with bind -x.

REPLY The user’s response to the select command; also, the
result of the read command if no variable names are
given.

SECONDS The number of seconds since the shell was started.

SHELL The full pathname of the shell.

SHELLOPTS LR A colon-separated list of enabled shell options.

SHLVL Incremented by 1 each time a new instance (not a
subshell) of bash is invoked. This is intended to be a
count of how deeply your bash shells are nested.
TEXTDOMAIN Indicates the location of the message catalog files if
not using LC_MESSAGES.


TEXTDOMAINDIR Indicates the location of the message catalog files if 
using TEXTDOMAIN. 


TIMEFORMAT Specifies the format for the output from using the time 
reserved word on a command pipeline. 


TMOUT If set to a positive integer, the number of seconds after 
which the shell automatically terminates if no input is 
received. 

TMPDIR If set, the name of the directory in which bash creates
temporary files for the shell’s use.


UID R The numeric real user ID of the current user. 
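
Two rows of Table A-6 are frequent sources of confusion: the difference between "$*" and "$@", and the short lifetime of PIPESTATUS. The sketch below, whose function and variable names are illustrative only, shows both.

```shell
#!/usr/bin/env bash
demo() {
    local IFS=,
    star="$*"      # one word: the arguments joined by the first character of IFS
    at=("$@")      # one array element per argument, embedded spaces preserved
}
demo "a b" c d
echo "$star"       # a b,c,d
echo "${#at[@]}"   # 3

# PIPESTATUS is overwritten by the next command, so copy it immediately.
false | true | (exit 3)
statuses=("${PIPESTATUS[@]}")
echo "${statuses[*]}"   # 1 0 3
```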


set Options 


The options in Table A-7 can be turned on with the set -arg command. 
They are all initially off except where noted. Full names, where listed, are 
arguments to set that can be used with set -o. The full names 
braceexpand, histexpand, history, keyword, and onecmd are not available 
in versions of bash prior to 2.0. Also, in those versions, hashing is switched 
with -d. 


The mode string variables are only displayed when the show-mode-in-prompt
readline variable is enabled (see Table A-22).


For further reference, see http://bit.ly/2uUXPpQ.


Table A-7. set options 


Option Full name (-o) Meaning


-a allexport Export all subsequently defined or modified variables.

-B braceexpand Perform brace expansion. This is on by default.

-b notify Report the status of terminating background jobs immediately.

-C noclobber Prevent output redirection using >, >&, and <> from overwriting
existing files.

-E errtrace If set, any trap on ERR is inherited by shell functions, command
substitutions, and commands executed in a subshell
environment.

-e errexit Exit the shell when a simple command exits with nonzero
status. A simple command is a command not part of a while,
until, or if; nor part of a && or || list; nor a command whose
return value is inverted by !.

emacs Use an Emacs-style line editing interface. This also affects the
editing interface used for read -e.

-f noglob Disable filename expansion (globbing).

-H histexpand Enable !-style history substitution. This option is on by default
for interactive shells.

history Enable command history. On by default in interactive shells.

-h hashall Locate and remember (hash) commands as they are looked up
for execution. This option is enabled by default.

ignoreeof Prevent an interactive shell from exiting upon reading EOF
(Ctrl-D).

-k keyword All arguments in the form of assignment statements are placed
in the environment for a command, not just those that precede
the command name.

-m monitor Enable job control. On by default in interactive shells.

-n noexec Read commands but do not execute them. This may be used to
check a script for syntax errors. This option is ignored by
interactive shells.

-P physical If set, do not resolve symbolic links when performing
commands such as cd that change the current directory.


-p privileged Turn on privileged mode. 


pipefail The return value of a pipeline is the value of the last 
(rightmost) command to exit with a nonzero status, or zero if 
all commands in the pipeline exit successfully. This option is 
disabled by default. 


posix Change the behavior of bash where the default operation 
differs from the POSIX standard to match the standard. 


-T functrace If set, any traps on DEBUG and RETURN are inherited by shell 
functions, command substitutions, and commands executed in 
a subshell environment. 


-t onecmd Exit after reading and executing one command. 


-u nounset Treat unset variables and parameters other than the special 
parameters @ or * as an error (not null) when performing 
parameter expansion. 


-v verbose Print shell input lines before running them. 
vi Use vi-style command-line editing. 
-x xtrace Print commands (after expansions) before running them.


-- If no arguments follow this option, then the positional 
parameters are unset. Otherwise, the positional parameters are 
set to the arguments, even if some of them begin with a -. 


- Signals the end of options. All remaining arguments are 
assigned to the positional parameters. -x and -v are turned off. 
If there are no remaining arguments to set, the positional 
arguments remain unchanged. 
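
To make the pipefail and nounset rows concrete, here is a minimal sketch that compares a pipeline's exit status with and without pipefail, and then triggers a nounset error inside a subshell so that the script itself keeps running. The variable names are made up for the example, and the script assumes errexit is not in effect.

```shell
#!/usr/bin/env bash
set +o pipefail
false | true
without=$?        # 0: only the status of the last command counts

set -o pipefail
false | true
with=$?           # 1: the rightmost nonzero status in the pipeline

set +o pipefail   # restore the default
echo "without=$without with=$with"

# With -u (nounset), expanding an unset variable is an error. Run it in a
# subshell so that the error ends only the subshell.
unset -v surely_unset_variable
( set -u; : "$surely_unset_variable" ) 2>/dev/null
nounset_status=$?
echo "nounset subshell exited with status $nounset_status"
```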


shopt Options 


The shopt options are set with shopt -s arg and unset with shopt -u arg
(see Table A-8). Versions of bash prior to 2.0 had environment variables to
perform some of these settings. Setting them equated to shopt -s. The
variables (and corresponding shopt options) were
allow_null_glob_expansion (nullglob), cdable_vars
(cdable_vars), command_oriented_history (cmdhist),
glob_dot_filenames (dotglob), and no_exit_on_failed_exec
(execfail). These variables no longer exist.


The options extdebug, failglob, force_fignore, and gnu_errfmt are not 
available in versions of bash prior to 3.0. 


For further reference, see http://bit.ly/2vPjVsC. 


Table A-8. shopt options 
Option Meaning if set 


autocd A command name that is the name of a directory is 
executed as if it were the argument to the cd command. 


cdable_vars An argument to cd that is not a directory is assumed to be 
the name of a variable whose value is the directory to 
change to. 

cdspell Minor errors in the spelling of a directory name supplied
to the cd command will be corrected if there is a suitable
match. This correction includes missing letters, incorrect
letters, and letter transposition. This option works for
interactive shells only.


checkhash Commands found in the hash table are checked for
existence before being executed, and nonexistence forces a
$PATH search.


checkjobs bash lists the status of any stopped and running jobs
before exiting an interactive shell. 


checkwinsize bash checks the window size after each command and, if 
necessary, updates the values of LINES and COLUMNS. 


cmdhist bash attempts to save all lines of a multiline command in a 
single history entry. 
compat32 bash changes its behavior to that of version 3.2 with
respect to locale-specific string comparison when using
certain commands and operators.

compat40 bash changes its behavior to that of version 4.0 with
respect to locale-specific string comparison when using
certain commands and operators.

compat41 bash, when in POSIX mode, treats a single quote in a
double-quoted parameter expansion as a special character.

compat42 bash does not process the replacement string in the pattern
substitution word expansion using quote removal.

compat43 bash changes its behavior to that of version 4.3 with
respect to certain operations.

complete_fullquote bash quotes all shell metacharacters in filenames and
directory names when performing completion.

direxpand bash replaces directory names with the results of word
expansion when performing filename completion.

dirspell bash attempts spelling correction on directory names
during word completion if the directory name initially
supplied does not exist.

dotglob bash includes filenames beginning with a dot in the results
of filename expansion.

execfail A noninteractive shell will not exit if it cannot execute the
argument to an exec command. Interactive shells do not
exit if an exec fails.

expand_aliases Aliases are expanded.

extdebug Behavior intended for use by debuggers is enabled. For
example, the -F option of declare displays the source
filename and line number corresponding to each function
name supplied as an argument; if the command run by the
DEBUG trap returns a nonzero value, the next command is
skipped and not executed; and if the command run by the
DEBUG trap returns a value of 2, and the shell is executing in
a subroutine, a call to return is simulated.

extglob Extended pattern-matching features are enabled.

extquote If set, $'string' and $"string" quoting is performed within
${parameter} expansions enclosed in double quotes.

failglob Patterns that fail to match filenames during pathname
expansion result in an expansion error.

force_fignore The suffixes specified by the $FIGNORE shell variable cause
words to be ignored when performing word completion
even if the ignored words are the only possible
completions.

globasciiranges Range expressions used in pattern-matching bracket
expressions behave as if in the traditional C locale when
performing comparisons.

globstar The pattern ** used in a filename expansion context will
match all files and zero or more directories and
subdirectories.

gnu_errfmt Shell error messages are written in the standard GNU error
message format.

histappend The history list is appended to the file named by the value
of the variable $HISTFILE when the shell exits, rather than
overwriting the file.

histreedit If readline is being used, the opportunity is given for
reediting a failed history substitution.

histverify If readline is being used, the results of history substitution
are not immediately passed to the shell parser. Instead, the
resulting line is loaded into the readline editing buffer,
allowing further modification.

hostcomplete If readline is being used, an attempt will be made to
perform hostname completion when a word beginning
with @ is being completed.

huponexit bash will send a SIGHUP signal to all jobs when an
interactive login shell exits.

inherit_errexit Command substitution inherits the value of the errexit
option, instead of unsetting it in the subshell environment.

interactive_comments Allows a word beginning with # and all subsequent
characters on the line to be ignored in an interactive shell.

lastpipe If job control is not active, the shell runs the last command
of a pipeline not executed in the background in the current
shell environment.

lithist If the cmdhist option is enabled, multiline commands are
saved to the history with embedded newlines rather than
using semicolon separators where possible.

login_shell bash was started as a login shell. This is a read-only value.

mailwarn If the file being checked for mail has been accessed since
the last time it was checked, the message “The mail in
mailfile has been read” is displayed.

no_empty_cmd_completion If readline is being used, no attempt will be made to
search the $PATH for possible completions when
completion is attempted on an empty line.

nocaseglob bash matches filenames in a case-insensitive fashion when
performing pathname expansion.

nocasematch bash matches patterns in a case-insensitive fashion when
performing certain operations.

nullglob Causes patterns that match no files to expand to null
strings rather than to themselves.

progcomp Programmable completion facilities are enabled. Default is
on.

promptvars Prompt strings undergo variable and parameter expansion
after being expanded.


restricted_shell The shell was started in restricted mode. This value cannot 
be changed. 

shift_verbose The shift builtin will print an error if it has shifted past the
last positional parameter.


sourcepath The source builtin will use the value of $PATH to find the 
directory containing the file supplied as an argument. 


xpg_echo echo expands backslash escape sequences by default. 
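
As one hedged example of how a shopt option changes behavior, the sketch below expands a pattern that matches nothing, first with nullglob off (the default) and then with it on. It uses a temporary directory created with mktemp, and the variable names are ours.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)                # an empty scratch directory

shopt -u nullglob
unmatched=("$tmpdir"/*.log)        # no match: the pattern itself is kept
default_count=${#unmatched[@]}     # 1

shopt -s nullglob
unmatched=("$tmpdir"/*.log)        # no match: expands to nothing
nullglob_count=${#unmatched[@]}    # 0

shopt -u nullglob                  # restore the default
rmdir "$tmpdir"
echo "default=$default_count nullglob=$nullglob_count"
```

Setting nullglob is useful before a for loop over a pattern, since the loop body then does not run at all when nothing matches.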


Test Operators 


The operators in Table A-9 are used with test and the [ ] and [[ ]]
constructs. They can be logically combined with -a (“and”) and -o (“or”)
and grouped with escaped parentheses (\( \)). The string comparisons <
and > and the [[ ]] construct are not available in versions of bash prior to
2.0, and =~ is only available in bash version 3.0 and later, as noted.


See http://bit.ly/2wloyMK.


Table A-9. Test operators 


Operator True if 

-a file True if file exists; same as -e 

-b file True if file exists and is a block device file 

-c file True if file exists and is a character device file 
-d file True if file exists and is a directory 

-e file True if file exists; same as -a 

-f file True if file exists and is a regular file 

-g file True if file exists and has its setgid bit set 
-G file True if file exists and is owned by the effective group ID


-h file True if file exists and is a symbolic link; same as -L 

-k file True if file exists and has its sticky bit set 

-L file True if file exists and is a symbolic link; same as -h 

-n string True if the length of string is nonzero 

-N file True if file was modified since it was last read 

-0 file True if file exists and is owned by the effective user ID 

-p file True if file exists and is a pipe or named pipe (FIFO file) 

-r file True if file exists and is readable 

-s file True if file exists and is not empty 

-S file True if file exists and is a socket 

-t fd True if file descriptor fd is open and refers to a terminal. 

-u file True if file exists and has its setuid bit set 

-w file True if file exists and is writable 

-x file True if file exists and is executable, or file is a directory that can be 
searched 

-z string True if the length of string is zero 

file1 -ef True if file1 and file2 refer to the same device and inode numbers 

file2 

file1 -nt True if file1’s modification date is newer than file2’s, or if file1 

file2 exists and file2 does not 

file1 -ot True if file1’s modification date is older than file2’s, or if file2 

file2 exists and file1 does not 

stringi = True if string1 equals string2 (POSIX version) 
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string2 


String! == True if string1 equals string2 

string2 

string! != True if the strings are not equal 

string2 

stringi < True if string1 sorts before string2 lexicographically 
string2 

stringi > True if string1 sorts after string2 lexicographically 
string2 

stringi =~ True if string1 matches the extended regular expression regexp? 
regexp 

exprA -eq True if arithmetic expressions exprA and exprB are equal 
exprB 

exprA -ne True if arithmetic expressions exprA and exprB are not equal 
exprB 

exprA -lt True if exprA is less than exprB 

exprB 

exprA -le True if exprA is less than or equal to exprB 

exprB 

exprA -gt True if exprA is greater than exprB 

exprB 

exprA -ge True if exprA is greater than or equal to exprB 

exprB 

exprA -a True if exprA is true and exprB is true 

exprB 

exprA -o True if exprA is true or exprB is true 

exprB 


@ Only available in bash version 3.0 and later. May only be used inside [[ ]]. 
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I/O Redirection 


Table A-10 is a complete list of I/O redirectors. Note that there are two 
formats for specifying STDOUT and STDERR redirection: &>file and 
>&file. The second of these is the one used throughout this book; note, 
however, that the Bash Reference Manual states that the first form is preferred. 


See http://bit.ly/2xfqrqA. 
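
The following is a small sketch, not a complete tour, of a few of the redirectors from Table A-10 working together. The function noisy and the file names are hypothetical and exist only to produce output on both streams; &>> assumes bash 4.0 or newer:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
log=$workdir/out.log

# noisy is a hypothetical command that writes to both streams.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

noisy > "$log" 2>&1                   # both streams into the file
noisy &>> "$log"                      # append both streams (bash 4.0+)
noisy 2>/dev/null | tr 'a-z' 'A-Z'    # only stdout reaches the pipe

# Open file descriptor 3 for writing, use it, then close it.
exec 3> "$workdir/fd3.txt"
echo "written via fd 3" >&3
exec 3>&-

# Here-string: feed a word to standard input.
read -r first rest <<< "alpha beta gamma"
echo "$first"

# Count the lines collected in the log (two calls, two lines each).
wc -l < "$log"
```

In the first call, the order matters: > "$log" must come before 2>&1 so that standard error is duplicated onto the already-redirected standard output.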


Table A-10. Input/output redirection 


Redirector       Function 

cmd1 | cmd2      Pipe; send standard output of cmd1 as standard input to cmd2. 
cmd1 |& cmd2     Pipe; send standard output and standard error of cmd1 as standard 
                 input to cmd2 (bash 4.0 or newer). 
> file           Direct standard output to file. 
< file           Take standard input from file. 
>> file          Direct standard output to file; append to file if it already exists. 
>| file          Force standard output to file even if noclobber is set. 
n>| file         Force output to file from file descriptor n even if noclobber is set. 
<> file          Use file as both standard input and standard output. 
&> file          Direct standard output and standard error to file. 
&>> file         Direct standard output and standard error to file; append to file if it 
                 already exists (bash 4.0 or newer). 
n<> file         Use file as both input and output for file descriptor n. 
<< label         Here-document. 
<<< word         Here-string. 
n> file          Direct file descriptor n to file. 
n< file          Take file descriptor n from file. 
n>> file         Direct file descriptor n to file; append to file if it already exists. 
n>&              Duplicate standard output to file descriptor n. 
n<&              Duplicate standard input from file descriptor n. 
n>&m             File descriptor n is made to be a copy of the output file descriptor m. 
n<&m             File descriptor n is made to be a copy of the input file descriptor m. 
&>file           Direct standard output and standard error to file. 
<&-              Close the standard input. 
>&-              Close the standard output. 
n>&-             Close the output from file descriptor n. 
n<&-             Close the input from file descriptor n. 
n>&word          If n is not specified, the standard output (file descriptor 1) is used; if 
                 the digits in word do not specify a file descriptor open for output, a 
                 redirection error occurs. As a special case, if n is omitted, and word 
                 does not expand to one or more digits, the standard output and standard 
                 error are redirected as described previously. 
n<&word          If word expands to one or more digits, the file descriptor denoted by n 
                 is made to be a copy of that file descriptor; if the digits in word do not 
                 specify a file descriptor open for input, a redirection error occurs. If 
                 word evaluates to -, file descriptor n is closed; if n is not specified, the 
                 standard input (file descriptor 0) is used. 
n>&digit-        Moves the file descriptor digit to file descriptor n, or the standard 
                 output (file descriptor 1) if n is not specified. 
n<&digit-        Moves the file descriptor digit to file descriptor n, or the standard 
                 input (file descriptor 0) if n is not specified; digit is closed after being 
                 duplicated to n. 


echo Options and Escape Sequences 


echo accepts three options (see Table A-11). 


Table A-11. echo options 


Option Function 
-e Turns on the interpretation of backslash-escaped characters 


-E Turns off the interpretation of backslash-escaped characters on systems 
where this mode is the default 


-n Omits the final newline (same as the \c escape sequence; see Table A-12) 


echo also accepts a number of escape sequences that start with a backslash 
(Table A-12). These sequences exhibit fairly predictable behavior, except for 
\f, which on some displays causes a screen clear while on others it causes a 
line feed, and it ejects the page on most printers. \v is somewhat obsolete; it 
usually causes a line feed. 


See http://bit.ly/2ii87KI. 


Table A-12. echo escape sequences 


Sequence       Character printed 

\a             Alert or Ctrl-G (bell) 
\b             Backspace or Ctrl-H 
\c             Suppress further output 
\e             Escape character (same as \E) 
\E             Escape character 
\f             Form feed or Ctrl-L 
\n             Newline (not at end of command) or Ctrl-J 
\r             Return (Enter) or Ctrl-M 
\t             Tab or Ctrl-I 
\v             Vertical tab or Ctrl-K 
\\             Single backslash 
\0nnn          The eight-bit character whose value is the octal (base-8) value nnn (zero 
               to three octal digits) 
\nnn           Same as \0nnn 
\xHH           The eight-bit character whose value is the hexadecimal (base-16) value 
               HH (one or two hex digits) 
\uHHHH         The Unicode (ISO/IEC 10646) character whose value is the hexadecimal 
               value HHHH (one to four hex digits) 
\UHHHHHHHH     The Unicode (ISO/IEC 10646) character whose value is the hexadecimal 
               value HHHHHHHH (one to eight hex digits) 


The \n, \0, and \x sequences are even more device-dependent and can be 
used for complex I/O, such as cursor control and special graphics characters. 
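
As a brief illustration of the options and escape sequences above, the sketch below assumes the bash builtin echo with the xpg_echo option left off (its default), so escapes are only interpreted when -e is given:

```shell
#!/usr/bin/env bash
echo -e 'col1\tcol2'           # \t becomes a real tab
echo -E 'col1\tcol2'           # -E leaves the backslash sequence untouched
echo -n 'no newline here'      # -n omits the final newline...
echo ' <- continues the line'  # ...so this lands on the same line
echo -e 'stop here\c and this part is never printed'
echo                           # finish the line that \c left open
echo -e '\x41\0101'            # hex 41 and octal 101 are both "A"
```

Because echo behavior differs between shells and settings, printf (described next) is usually the safer choice when the exact output matters.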


printf 


The printf command, available in bash since version 2.02, has three parts 
(beyond the command name): 


printf [-v var] format-string [arguments] 


-v var is optional and causes the output to be assigned to the variable var 
rather than being printed to standard output. 


format-string describes the format specifications; this is best supplied as a 
string constant in quotes. arguments is a list, such as a list of strings or 
variable values that correspond to the format specifications. 


The format is reused as necessary to use up all of the arguments. If the format 
requires more arguments than are supplied, the extra format specifications 
behave as if a zero value or null string, as appropriate, had been supplied. 
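
A short sketch of these three behaviors (format reuse, defaults for missing arguments, and -v assignment) follows. The variable name padded and the sample words are arbitrary choices for illustration:

```shell
#!/usr/bin/env bash
# The format is applied once per pair of arguments.
printf '%-6s = %d\n' apples 3 pears 12

# The %d has no matching argument, so it behaves as if 0 had been supplied.
printf '%s has %d items\n' basket

# -v stores the formatted result in a variable instead of printing it.
printf -v padded '%05d' 42
echo "$padded"
```

Using -v avoids a command substitution such as padded=$(printf ...), which would run printf in a subshell.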


A format specification is preceded by a percent sign (%), and the specifier is 
one of the characters described in Table A-13. Two of the main format 
specifiers are %s for strings and %d for decimal integers. 


See http://bit.ly/2uUYkjy. 


Table A-13. printf format specifiers 


Format character   Meaning 

%b                 Causes printf to expand backslash escape sequences in the 
                   corresponding argument in the same way as echo -e 
%c                 ASCII character (prints first character of corresponding argument) 
%d, %i             Decimal (base 10) integer 
%e                 Floating-point format ([-]d.precisione[+-]dd); see the text after the 
                   table for the meaning of precision 
%E                 Floating-point format ([-]d.precisionE[+-]dd) 
%f                 Floating-point format ([-]ddd.precision) 
%g                 %e or %f conversion, whichever is shorter, with trailing zeros removed 
%G                 %E or %f conversion, whichever is shorter, with trailing zeros removed 
%o                 Unsigned octal value 
%q                 Causes printf to output the corresponding argument in a format that can 
                   be reused as shell input 
%s                 String 
%u                 Unsigned decimal value 
%x                 Unsigned hexadecimal number; uses a-f for 10 to 15 
%X                 Unsigned hexadecimal number; uses A-F for 10 to 15 
%%                 Literal % 
%(datefmt)T        Causes printf to output the date-time string resulting from using datefmt 
                   as a format string for strftime (bash 4.2 or newer) 


The printf command can be used to specify the width and alignment of output 
fields. A format expression can take three optional modifiers following the % 
and preceding the format specifier: 


%flags width.precision format-specifier 


The width of the output field is a numeric value. When you specify a field 
width, the contents of the field are right-justified by default. You must 
specify a flag of - to get left-justification (the rest of the flags are shown in 
Table A-16). Thus, %-20s outputs a left-justified string in a field 20 characters 
wide. If the string is shorter than 20 characters, the field is padded with 
whitespace to fill it. In the following examples, we put our format specifier 
between a pair of | characters in our format string so you can see the width of 
the field in the output. The first example right-justifies the text: 


printf "|%10s|\n" hello 


It produces: 


|     hello| 


The next example left-justifies the text: 


printf "|%-10s|\n" hello 


It produces: 


|hello     | 


The precision modifier, used for decimal or floating-point values, controls the 
number of digits that appear in the result. For string values, it controls the 
maximum number of characters from the string that will be printed. 


You can even specify both the width and precision dynamically, via values in 
the printf argument list. You do this by specifying asterisks in the format 
expression, instead of literal values: 


$ myvar=42.123456 
$ mysig=6 
$ printf "|%*.*G|\n" 5 $mysig $myvar 
|42.1235| 
$ 


In this example, the width is 5, the precision is 6, and the value to print comes 
from the value of $myvar. The precision is optional and its exact meaning 
varies by control letter, as shown in Table A-14. 

Table A-14. Meaning of “precision” based on printf format specifier 


Format             What “precision” means 

%d, %i, %o, %u,    The minimum number of digits to print. When the value has fewer digits, 
%x, %X             it is padded with leading zeros. The default precision is 1. 
%e, %E             The number of digits to print after the decimal point. When the value 
                   has fewer digits, it is padded with zeros after the decimal point. The 
                   default precision is 6. A precision of 0 inhibits printing of the 
                   decimal point. 
%f                 The number of digits to the right of the decimal point. 
%g, %G             The maximum number of significant digits. 
%s                 The maximum number of characters to print. 
%b                 [POSIX shell; may be nonportable to other versions of printf.] When used 
                   instead of %s, expands echo-style escape sequences in the argument 
                   string (see Table A-15). 
%q                 [POSIX shell; may be nonportable to other versions of printf.] When used 
                   instead of %s, prints the string argument in such a way that it can be 
                   used for shell input. 
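
To make the per-specifier meaning of precision in Table A-14 concrete, here is a small sketch. It sets LC_ALL=C as an assumption, so that the decimal point is a period regardless of the user's locale:

```shell
#!/usr/bin/env bash
LC_ALL=C   # assumption: force "." as the decimal point

printf '[%.3d]\n' 7            # integers: minimum digits      -> [007]
printf '[%.2f]\n' 3.14159      # %f: digits after the point    -> [3.14]
printf '[%.3e]\n' 12345.678    # %e: digits after the point    -> [1.235e+04]
printf '[%.3g]\n' 12345.678    # %g: significant digits        -> [1.23e+04]
printf '[%.3s]\n' bash         # strings: maximum characters   -> [bas]
```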


%b and %q are additions to bash (and other POSIX-compliant shells) that 
provide useful features at the expense of nonportability to versions of the 
printf command found in some other shells and in other places in Unix. Here 
are two examples to make their functions a little clearer. 


%q shell quotes: 


$ printf "%q\n" "greetings to the world" 
greetings\ to\ the\ world 
$ 


%b echo-style escapes: 


$ printf "%s\n" 'hello\nworld' 
hello\nworld 
$ printf "%b\n" 'hello\nworld' 
hello 
world 
$ 


Table A-15 shows the escape sequences that will be translated in a string 
printed with the %b format. 


Table A-15. printf escape sequences 


Escape sequence   Meaning 

\e                Escape character 
\a                Bell character 
\b                Backspace character 
\f                Form feed character 
\n                Newline character 
\r                Carriage return character 
\t                Tab character 
\v                Vertical tab character 
\'                Single-quote character 
\"                Double-quote character 
\\                Backslash character 
\nnn              8-bit character whose ASCII value is the 1-, 2-, or 3-digit octal 
                  number nnn 
\xHH              8-bit character whose ASCII value is the 1- or 2-digit hexadecimal 
                  number HH 


Finally, one or more flags may precede the field width and the precision in a 
printf format specifier. We’ve already seen the - flag for left-justification. 
The rest of the flags are shown in Table A-16. 


Table A-16. printf flags 


Character   Description 

-           Left-justify the formatted value within the field. 
Space       Prefix positive values with a space and negative values with a minus. 
+           Always prefix numeric values with a sign, even if the value is positive. 
#           Use an alternate form: %o has a preceding 0; %x and %X are prefixed with 
            0x and 0X, respectively; %e, %E, and %f always have a decimal point in the 
            result; and %g and %G do not have trailing zeros removed. 
0           Pad output with zeros, not spaces. This only happens when the field 
            width is wider than the converted result. In the C language, this flag 
            applies to all output formats, even nonnumeric ones. For bash, it only 
            applies to the numeric formats. 
'           Format with thousands grouping characters if the format specification 
            includes %i, %d, %u, %f, %F, %g, or %G (although this is POSIX, it’s still not 
            always implemented). 
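
The flags can be seen in action with a few one-liners. This is a sketch that again assumes LC_ALL=C; the thousands-grouping flag is left out because its output depends on the locale and on the implementation:

```shell
#!/usr/bin/env bash
LC_ALL=C   # assumption: keep numeric formatting locale-independent

printf '[%+d] [%+d]\n' 5 -5              # + flag:     [+5] [-5]
printf '[% d] [% d]\n' 5 -5              # space flag: [ 5] [-5]
printf '[%05d]\n' 42                     # 0 flag:     [00042]
printf '[%-5d]\n' 42                     # - flag:     [42   ]
printf '[%#o] [%#x] [%#X]\n' 8 255 255   # # flag:     [010] [0xff] [0XFF]
```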


Examples 


The examples in Table A-17 use a shell variable assigned as follows: 


PI=3.141592653589 


Table A-17. printf examples 


printf statement               Result       Comment 

printf '%f\n' $PI              3.141593     Note the default rounding. 
printf '%f.5\n' $PI            3.141593.5   Not what you want. A common mistake: the 
                                            precision (.5) belongs between the % and the 
                                            f; since it isn’t there, the .5 is just 
                                            appended like any text. 
printf '%.5f\n' $PI            3.14159      Gives five places to the right of the decimal 
                                            point. 
printf '%+.2f\n' $PI           +3.14        Leading + sign, only two digits to the right 
                                            of the decimal point. 
printf '[%.4s]\n' s string     [s]          Truncates to four characters; with only one 
                               [stri]       character, we get only one-character-wide 
                                            output. Note reuse of format string. 
printf '[%4s]\n' s string      [   s]       Assures us of a minimum four-character field 
                               [string]     width, right-justified; doesn’t truncate, 
                                            though. 
printf '[%-4.4s]\n' s string   [s   ]       Does it all: minimum width of four, maximum 
                               [stri]       width of four, truncating if necessary, and 
                                            left-justifying (due to the minus sign) 
                                            anything shorter than four. 


Here is one more example that will not display well in the table. The 
traditional way to write printf statements is to embed all formatting, 
including things like newlines, in the format string. This is shown in the 
table. That is encouraged, but you don’t have to do it that way, and 
sometimes it’s easier if you don’t. Note that → denotes a tab character in the 
output of this example: 


$ printf "%b" "\aRing terminal bell, then tab\t then newline\nThen line 2.\n" 
Ring terminal bell, then tab → then newline 
Then line 2. 


Finally, we really like the %(datefmt)T introduced in bash 4.2 because you 
can now often eliminate a subshell call to date, like so: 


$ printf "%(%F)T\n" '-1' 
2017-02-06 

$ printf "%(%F_%T)T\n" 
2017-02-06_20:31:25 

$ printf "%(%F_%T%z)T: %s\n" '-1' 'Your log line here...' 
2017-02-06_20:32:24-0500: Your log line here... 


WARNING 
It took a few bash releases for %(datefmt)T behavior to stabilize! Sometimes 
you can omit the -1 argument and sometimes you can’t, depending on which 
version of bash you have and what you’re doing. For maximum portability, do 
not omit the argument. 
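
Following that advice, a logging helper might always pass -1 explicitly. The sketch below assumes bash 4.2 or newer; the function name log_line and the report file name are hypothetical:

```shell
#!/usr/bin/env bash
# log_line is a hypothetical helper: it prefixes a message with a timestamp
# without forking a date process. -1 means "the current time".
log_line() {
    printf '%(%F_%T%z)T: %s\n' -1 "$*"
}

log_line "backup started"

# Capture a date stamp in a variable, again without a subshell.
printf -v today '%(%F)T' -1
echo "report-$today.txt"
```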


See Also 

http://wiki.bash-hackers.org/commands/builtin/printf 

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html 


Date and Time String Formatting with strftime 


Table A-18 shows common date and time string formatting options. Consult 
your system’s manpages for date and strftime(3), as both the options and 
what they mean vary from system to system. 


Table A-18. strftime format codes 


Format   Description 

%%       A literal %. 
%a       The locale’s abbreviated weekday name (Sun..Sat). 
%A       The locale’s full weekday name (Sunday..Saturday). 
%B       The locale’s full month name (January..December). 
%b       The locale’s abbreviated month name (Jan..Dec). See also %h. 
%c       The locale’s default/preferred date and time representation. 
%C       The century (a year divided by 100 and truncated to an integer) as a decimal 
         number (00..99). 
%d       The day of the month as a decimal number (01..31). 
%D       The date in the format %m/%d/%y (MM/DD/YY). Note that the United States 
         uses MM/DD/YY while everyone else uses DD/MM/YY, so this format is 
         ambiguous and should be avoided. Use %F instead, since it’s a recognized 
         standard and it sorts well. 
%e       The day of the month as a blank-padded decimal number ( 1..31). 
%F       The date in the format %Y-%m-%d (the ISO 8601 date format CCYY-MM-DD, 
         except when it’s the full month name, as on HP-UX). 
%g       The two-digit year corresponding to the %V week number (YY). 
%G       The four-digit year corresponding to the %V week number (CCYY). 
%H       The hour (24-hour clock) as a decimal number (00..23). 
%h       The locale’s abbreviated month name (Jan..Dec). See also %b. 
%I       The hour (12-hour clock) as a decimal number (01..12). 
%j       The day of the year as a decimal number (001..366). 
%k       The hour (24-hour clock) as a blank-padded decimal number ( 0..23). 
%l       The hour (12-hour clock) as a blank-padded decimal number ( 1..12). 
%m       The month as a decimal number (01..12). 
%M       The minute as a decimal number (00..59). 
%n       A literal newline. 
%N       Nanoseconds (000000000..999999999). [GNU] 
%p       The locale’s equivalent of either “AM” or “PM”. 
%P       The locale’s equivalent of either “am” or “pm”. [GNU] 
%r       The locale’s representation of 12-hour clock time using AM/PM notation 
         (HH:MM:SS AM/PM). 
%R       The time in the format %H:%M (HH:MM). 
%s       The number of seconds since the epoch, UTC (January 1, 1970 at 00:00:00). 
%S       The second as a decimal number (00..61). The range of seconds is (00..61) 
         instead of (00..59) to allow for the periodic occurrence of leap seconds and 
         double leap seconds. 
%t       A literal tab. 
%T       The time in the format %H:%M:%S (HH:MM:SS). 
%u       The weekday (Monday as the first day of the week) as a decimal number 
         (1..7). 
%U       The week number of the year (Sunday as the first day of the week) as a 
         decimal number (00..53). 
%v       The date in the format %e-%b-%Y (D-MMM-CCYY). [Not standard] 
%V       The week number of the year (Monday as the first day of the week) as a 
         decimal number (01..53). According to ISO 8601 the week containing 
         January 1 is week 1 if it has four or more days in the new year; otherwise it 
         is week 53 of the previous year, and the next week is week 1. The year is 
         given by the %G conversion specification. 
%w       The weekday (Sunday as the first day of the week) as a decimal number 
         (0..6). 
%W       The week number of the year (Monday as the first day of the week) as a 
         decimal number (00..53). 
%x       The locale’s appropriate date representation. 
%X       The locale’s appropriate time representation. 
%y       The year without the century as a decimal number (00..99). 
%Y       The year with the century as a decimal number. 
%z       The offset from UTC in the ISO 8601 format [-]hhmm. 
%Z       The time zone name. 
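
A few of these codes can be tried directly through printf's %(datefmt)T (bash 4.2 or newer), which accepts a number of seconds since the epoch as its argument. This is only a sketch: the value 86400 is an arbitrary sample, %s is not available in every strftime implementation, and the printed dates depend on the local time zone, so no output is shown:

```shell
#!/usr/bin/env bash
epoch=86400   # arbitrary sample: one day after the epoch

# %s converts the broken-down local time back to seconds since the epoch,
# so this round-trips the original value.
printf -v roundtrip '%(%s)T' "$epoch"
echo "$roundtrip"

printf 'ISO date:     %(%F)T\n' "$epoch"
printf 'ISO week:     %(%G-W%V-%u)T\n' "$epoch"   # week-based year, week, weekday
printf 'Day of year:  %(%j)T\n' "$epoch"
printf 'UTC offset:   %(%z)T\n' "$epoch"
```

Note that %G rather than %Y is paired with %V, since the ISO week-based year can differ from the calendar year in the first and last days of a year.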


Pattern-Matching Characters 


Table A-19 lists the pattern-matching characters in bash. The material in this 
section is adapted from the Bash Reference Manual. 


Table A-19. Pattern-matching characters 


Character         Meaning 

*                 Matches any string, including the null string 
?                 Matches any single character 
[...]             Matches any one of the enclosed characters 
[!...] or [^...]  Matches any character not enclosed 


The following POSIX character classes may be used within brackets, as 
shown here (consult the grep or egrep manpage on your system for more 
details): 


[[:alnum:]] [[:alpha:]] [[:ascii:]] [[:blank:]] [[:cntrl:]] [[:digit:]] 
[[:graph:]] [[:lower:]] [[:print:]] [[:punct:]] [[:space:]] [[:upper:]] 
[[:word:]] [[:xdigit:]] 

The word character class matches letters, digits, and the character _. 


[=c=] matches all characters with the same collation weight (as defined by 
the current locale) as the character c, while [.symbol.] matches the collating 
symbol symbol. 


These character classes are affected by the locale setting. To get the 
traditional Unix values, use LC_COLLATE=C or LC_ALL=C. 
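
The pattern characters and character classes above are most often seen in case statements and filename globs. The following sketch uses a hypothetical function, classify, and sets LC_ALL=C as suggested above so that the classes have their traditional meaning:

```shell
#!/usr/bin/env bash
LC_ALL=C   # traditional character-class behavior

# classify is a hypothetical helper; the first matching pattern wins.
classify() {
    case $1 in
        '')               echo "empty" ;;
        [[:digit:]]*)     echo "starts with a digit" ;;
        *[[:space:]]*)    echo "contains whitespace" ;;
        [![:alpha:]]*)    echo "starts with a non-letter" ;;
        ?)                echo "single character" ;;
        *.[ch])           echo "C source or header" ;;
        *)                echo "something else" ;;
    esac
}

classify 9lives
classify "a b"
classify _x
classify q
classify main.c
classify hello
```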


extglob Extended Pattern-Matching Operators 


The operators in Table A-20 apply when using shopt -s extglob. Matches 
are case-sensitive, but you may use shopt -s nocasematch (bash 3.1+) to 
change that. This option affects case and [[ commands. 


Table A-20. extglob extended pattern-matching operators 


Grouping   Meaning 

@(...)     Only one occurrence 
*(...)     Zero or more occurrences 
+(...)     One or more occurrences 
?(...)     Zero or one occurrence 
!(...)     No occurrences of this, but anything else 
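
A quick way to get a feel for these operators is to test strings against them with [[ ]]. In this sketch the helper name matches is hypothetical; the unquoted right-hand side of == is treated as a pattern:

```shell
#!/usr/bin/env bash
shopt -s extglob

# matches STRING PATTERN: hypothetical helper that prints yes or no.
matches() {
    if [[ $1 == $2 ]]; then echo yes; else echo no; fi
}

matches jpeg     '@(jpg|jpeg|png)'   # yes: exactly one of the alternatives
matches ''       '*(a)'              # yes: zero occurrences is allowed
matches ''       '+(a)'              # no:  at least one occurrence required
matches aaa      '+(a)'              # yes
matches colour   'colo?(u)r'         # yes: optional u
matches file.txt '!(*.bak)'          # yes: anything except *.bak
matches file.bak '!(*.bak)'          # no

shopt -s nocasematch                 # bash 3.1+; affects case and [[ ]]
matches JPEG     '@(jpg|jpeg|png)'   # yes, now case-insensitive
shopt -u nocasematch
```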


tr Escape Sequences 


Table A-21 lists the tr escape sequences. 


Table A-21. tr escape sequences 


Sequence   Meaning 

\ooo       Character with octal value ooo (one to three octal digits) 
\\         A backslash character (i.e., escapes the backslash itself) 
\a         “Audible” bell, the ASCII BEL character (since b was taken for 
           backspace) 
\b         Backspace 
\f         Form feed 
\n         Newline 
\r         Return 
\t         Tab (sometimes called a horizontal tab) 
\v         Vertical tab 
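
For illustration, here are a few of these sequences in use. The sketch quotes each tr argument in single quotes so that the shell passes the backslash through to tr unchanged:

```shell
#!/usr/bin/env bash
printf 'one\ttwo\tthree\n' | tr '\t' ','    # tabs to commas: one,two,three
printf 'dos line\r\n'      | tr -d '\r'     # delete carriage returns
printf 'a\\b\n'            | tr '\\' '/'    # backslash to slash: a/b
printf 'ABC\n'             | tr '\101' 'x'  # octal 101 is "A": xBC
```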


readline Init File Syntax 


The GNU readline library provides the command line on which you type to 
communicate with bash and some other GNU utilities. It is amazingly 
configurable, but most people are not aware of this. 


Tables A-22, A-23, and A-24 are a subset of what is available to work with. 
See the readline documentation for the full details. 
The following is adapted directly from Chet Ramey’s documentation. 


You can modify the runtime behavior of readline by altering the values of 
variables in readline using the set command within the init file. The syntax is 
simple: 


set variable value 


Here, for example, is how to change from the default Emacs-like key binding 
to use vi line-editing commands: 


set editing-mode vi 


Variable names and values, where appropriate, are recognized without regard 
to case. Unrecognized variable names are ignored. 


Boolean variables (those that can be set to on or off) are set to on if the value 
is null or empty, on (case-insensitive), or 1. Any other value results in the 
variable being set to off. 
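
As a hedged sketch of what such an init file might contain, the script below writes a few set lines to a temporary file rather than touching a real ~/.inputrc. The particular settings are arbitrary samples drawn from Table A-22, and the variable name inputrc is an assumption of this example:

```shell
#!/usr/bin/env bash
inputrc=$(mktemp)

# Write a sample readline init file; each line uses "set variable value".
cat > "$inputrc" <<'EOF'
set editing-mode vi
set completion-ignore-case on
set bell-style none
set show-all-if-ambiguous on
EOF

# Count the settings written to the sample file.
grep -c '^set ' "$inputrc"
```

To try the settings without changing your own configuration, one option is to start a new interactive shell with the INPUTRC environment variable pointing at the sample file.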


Also, the mode string variables are only displayed when the show-mode-in- 
prompt readline variable is enabled (see also Table A-7). 


See http://bit.ly/2w1B9iV. 


Table A-22. readline configuration settings 


Variable Description 


bell-style | Controls what happens when readline wants to ring the terminal bell. 
If set to none, readline never rings the bell. If set to visible, readline 
uses a visible bell if one is available. If set to audible (the default), 
readline attempts to ring the terminal’s bell. 


bind-tty- If set to on, readline attempts to bind the control characters treated 
special- specially by the kernel’s terminal driver to their readline equivalents. 
chars The default is on. 
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blink- 
matching- 
paren 


colored- 
completion- 
prefix 


colored- 
stats 


comment- 
begin 


completion- 
display- 
width 


completion- 
ignore-case 


completion- 
map-case 


completion- 
prefix- 
display- 
length 


completion- 
query-items 


convert-meta 


If set to on, readline attempts to briefly move the cursor to an opening 
parenthesis when a closing parenthesis is inserted. The default is off. 


If set to on, when listing completions, readline displays the common 
prefix of the set of possible completions using a different color. The 
color definitions are taken from the value of the LS_COLORS 
environment variable. The default is off. 


If set to on, readline displays possible completions using different 
colors to indicate their file type. The color definitions are taken from 
the value of the LS_COLORS environment variable. The default is off. 


The string to insert at the beginning of the line when the insert- 
comment command is executed. The default value is #. 


The number of screen columns used to display possible matches when 
performing completion. The value is ignored if it is less than zero or 
greater than the terminal screen width. A value of 0 will cause matches 
to be displayed one per line. The default value is -1. 


If set to on, readline performs filename matching and completion in a 
case-insensitive fashion. The default is off. 


If set to on and completion-ignore-case is enabled, readline treats 
hyphens (-) and underscores (_) as equivalent when performing case- 
insensitive filename matching and completion. 


The length in characters of the common prefix of a list of possible 
completions that is displayed without modification. When set to a 
value greater than zero, common prefixes longer than this value are 
replaced with an ellipsis when displaying possible completions. 


The number of possible completions that determines when the user is 
asked whether the list of possibilities should be displayed. If the 
number of possible completions is greater than this value, readline will 
ask the user whether to display them; otherwise, they are simply listed. 
This variable must be set to an integer value greater than or equal to 
zero. A negative value means readline should never ask. The default 
limit is 100. 


If set to on, readline will convert characters with the eighth bit set to an 


795 


disable- 
compLetion 


echo- 
control- 
characters 


editing-mode 


emacs -mode- 
string 


enable- 
bracketed- 
paste 


enable- 
keypad 


enable-meta- 
key 


expand-tilde 


ASCII key sequence by stripping the eighth bit and prefixing an Esc 
character, converting them to a meta-prefixed key sequence. The 
default is on. 


If set to on, readline will inhibit word completion. Completion 
characters will be inserted into the line as if they had been mapped to 
self-insert. The default is off. 


When set to on, on operating systems that indicate they support it, 
readline echoes a character corresponding to a signal generated from 
the keyboard. The default is on. 


Controls which default set of key bindings is used. By default, readline 
starts up in Emacs editing mode, where the keystrokes are most similar 
to Emacs. This variable can be set to either emacs or vi. 


The string to display immediately before the last line of the primary 
prompt when Emacs editing mode is active. The value is expanded like 
a key binding, so the standard set of meta- and control prefixes and 
backslash escape sequences is available. Use the \1 and \2 escapes to 
begin and end sequences of nonprinting characters, which can be used 
to embed a terminal control sequence into the mode string. The default 


is @. 


When set to on, readline will configure the terminal in a way that will 
enable it to insert each paste into the editing buffer as a single string of 
characters, instead of treating each character as if it had been read from 
the keyboard. This can prevent pasted characters from being 
interpreted as editing commands. The default is off. 


When set to on, readline will try to enable the application keypad when 
it is called. Some systems need this to enable the arrow keys. The 
default is of f. 


When set to on, readline will try to enable any meta modifier key the 
terminal claims to support when it is called. On many terminals, the 
meta key is used to send eight-bit characters. The default is on. 


If set to on, tilde (~) expansion is performed when readline attempts 
word completion. The default is off. 
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history- 
preserve- 
point 


history-size 


horizontal- 
scroll-mode 


input-meta 


isearch- 
terminators 


keymap 


keyseq- 
timeout 


If set to on, the history code attempts to place the point (the current 
cursor position) at the same location on each history line retrieved with 
previous-history or next-history. The default is off. 


The maximum number of history entries to save in the history list. If 
set to 0, any existing history entries are deleted and no new entries are 
saved. If set to a value less than zero, the number of history entries is 
not limited. By default, the number of history entries is not limited. If 
an attempt is made to set history-size to a nonnumeric value, the 
maximum number of history entries will be set to 500. 


If set to on, the text of the lines being edited will scroll horizontally on 
a single screen line when they are longer than the width of the screen, 
instead of wrapping onto a new screen line. The default is off. 


If set to on, readline will enable eight-bit input (it will not clear the 
eighth bit in the characters it reads), regardless of what the terminal 
claims it can support. The default is off. The name meta-flag is a 
synonym for this variable. 


The string of characters that should terminate an incremental search 
without subsequently executing the character as a command. If this 
variable has not been given a value, the characters Esc and C-J will 
terminate an incremental search. 


Sets readline’s idea of the current keymap for key binding commands. 
Acceptable keymap names are emacs, emacs-standard, emacs -meta, 
emacs-ctlx, vi, vi-move, vi-command, and vi-insert. vi is equivalent to 
vi-command; emacs is equivalent to emacs-standard. The default value is 
emacs. The value of the editing-mode variable also affects the default 
keymap. 


Specifies the duration readline will wait for a character when reading 
an ambiguous key sequence (one that can form a complete key 
sequence using the input read so far, or can take additional input to 
complete a longer key sequence). If no input is received within the 
timeout, readline will use the shorter but complete key sequence. 
readline uses this value to determine whether or not input is available 
on the current input source (rl_instream by default). The value is 
specified in milliseconds, so a value of 1000 means that readline will 
wait one second for additional input. If this variable is set to a value 
less than or equal to zero, or to a nonnumeric value, readline will wait 
until another key is pressed to decide which key sequence to complete. 
The default value is 500. 


mark-directories 


If set to on, completed directory names have a slash appended. The 
default is on. 


mark-modified-lines 


If set to on, readline will display an asterisk (*) at the start of history 
lines that have been modified. The default is off. 


mark-symlinked-directories 


If set to on, completed names that are symbolic links to directories 
have a slash appended (subject to the value of mark-directories). The 
default is off. 


match-hidden-files 


If set to on, readline will match files whose names begin with a dot 
(hidden files) when performing filename completion, unless the 
leading . is supplied by the user in the filename to be completed. The 
default is on. 


menu-complete-display-prefix 


If set to on, menu completion displays the common prefix of the list of 
possible completions (which may be empty) before cycling through the 
list. The default is off. 


output-meta 


If set to on, readline will display characters with the eighth bit set 
directly rather than as a meta-prefixed escape sequence. The default is 
off. 


page-completions 


If set to on, readline uses an internal more-like pager to display a 
screenful of possible completions at a time. The default is on. 


print-completions-horizontally 


If set to on, readline will display completions with matches sorted 
horizontally in alphabetical order, rather than down the screen. The 
default is off. 


revert-all-at-newline 


If set to on, readline will undo all changes to history lines before 
returning when accept-line is executed. By default, history lines may 
be modified and retain individual undo lists across calls to readline. 
The default is off. 


show-all-if-ambiguous 


Alters the default behavior of the completion functions. If set to on, 
words that have more than one possible completion cause the matches 
to be listed immediately instead of ringing the bell. The default is off. 


show-all-if-unmodified 


Alters the default behavior of the completion functions in a fashion 
similar to show-all-if-ambiguous. If set to on, words that have more 
than one possible completion without any possible partial completion 
(the possible completions don’t share a common prefix) cause the 
matches to be listed immediately instead of ringing the bell. The 
default is off. 


show-mode-in-prompt 


If set to on, a character is added to the beginning of the prompt 
indicating the editing mode: emacs, vi-command, or vi-insert. The 
mode strings are user-settable. The default is off. 


skip-completed-text 


If set to on, this alters the default completion behavior when inserting a 
single match into the line. It’s only active when performing completion 
in the middle of a word. If enabled, readline does not insert characters 
from the completion that match characters after point in the word 
being completed, so portions of the word following the cursor are not 
duplicated. For instance, if this is enabled, attempting completion 
when the cursor is after the first e in Makefile will result in Makefile 
rather than Makefilefile, assuming there is a single possible 
completion. The default is off. 


vi-cmd-mode-string 


The string to display immediately before the last line of the primary 
prompt when vi editing mode is active and in command mode. The 
value is expanded like a key binding, so the standard set of meta and 
control prefixes and backslash escape sequences is available. Use the 
\1 and \2 escapes to begin and end sequences of nonprinting 
characters, which can be used to embed a terminal control sequence 
into the mode string. The default is (cmd). 


vi-ins-mode-string 


The string to display immediately before the last line of the primary 
prompt when vi editing mode is active and in insertion mode. The 
value is expanded like a key binding, so the standard set of meta and 
control prefixes and backslash escape sequences is available. Use the 
\1 and \2 escapes to begin and end sequences of nonprinting 
characters, which can be used to embed a terminal control sequence 
into the mode string. The default is (ins). 


visible-stats 


If set to on, a character denoting a file’s type is appended to the 
filename when listing possible completions. The default is off. 
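

As a rough illustration of how these variables are used, the sketch below prints a small readline init-file fragment using the set variable value syntax. The function name sample_inputrc and the particular values chosen are our own assumptions for the example, not recommendations; the variable names are taken from the table above.

```bash
# Hypothetical helper: print a small readline init-file fragment.
# The variable names come from the table above; which ones to set,
# and to what, is only an illustration.
sample_inputrc() {
    cat <<'EOF'
# Append a slash to completed symlinks that point to directories
set mark-symlinked-directories on
# List ambiguous completions immediately instead of ringing the bell
set show-all-if-ambiguous on
# Wait 200 ms (the default is 500) when a key sequence is ambiguous
set keyseq-timeout 200
# Show the editing mode at the start of the prompt
set show-mode-in-prompt on
EOF
}

sample_inputrc
```

One way to try this, assuming your readline init file is ~/.inputrc, is to append the output to that file and start a new shell. In an interactive shell you can usually apply a single setting right away with the bind builtin, for example bind 'set show-all-if-ambiguous on'.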


Emacs Mode Commands 


The material in this section also appears in Learning the bash Shell, 3rd 
Edition, by Cameron Newham (O’Reilly). 


Table A-23 is a complete list of readline Emacs editing mode commands. 


Table A-23. Emacs mode commands 


Command Meaning 


Ctrl-A Move to beginning of line. 
Ctrl-B Move backward one character. 
Ctrl-D Delete one character forward. 
Ctrl-E Move to end of line. 
Ctrl-F Move forward one character. 
Ctrl-G Abort the current editing command and ring the terminal bell. 
Ctrl-J Same as Return. 
Ctrl-K Delete (kill) forward to end of line. 
Ctrl-L Clear screen and redisplay the line. 
Ctrl-M Same as Return. 
Ctrl-N Next line in command history. 
Ctrl-O Same as Return, then display next line in history file. 
Ctrl-P Previous line in command history. 
Ctrl-R Search backward. 
Ctrl-S Search forward. 
Ctrl-T Transpose two characters. 
Ctrl-U Kill backward from point to the beginning of line. 
Ctrl-V Make the next character typed verbatim. 
Ctrl-V Tab Insert a tab. 
Ctrl-W Kill the word behind the cursor, using whitespace as the boundary. 
Ctrl-X/ List the possible filename completions of the current word. 
Ctrl-X~ List the possible username completions of the current word. 
Ctrl-X$ List the possible shell variable completions of the current word. 
Ctrl-X@ List the possible hostname completions of the current word. 
Ctrl-X! List the possible command name completions of the current word. 
Ctrl-X( Begin saving characters into the current keyboard macro. 
Ctrl-X) Stop saving characters into the current keyboard macro. 
Ctrl-Xe Reexecute the last keyboard macro defined. 
Ctrl-X Ctrl-R Read in the contents of the readline initialization file. 
Ctrl-X Ctrl-V Display version information on this instance of bash. 
Ctrl-Y Retrieve (yank) last item killed. 
Delete Delete one character backward. 
Ctrl-[ Same as Esc (most keyboards). 
Esc-B Move one word backward. 
Esc-C Change word after point to all capital letters. 
Esc-D Delete one word forward. 
Esc-F Move one word forward. 


Esc-L Change word after point to all lowercase letters. 


Esc-N Nonincremental forward search. 

Esc-P Nonincremental reverse search. 

Esc-R Undo all the changes made to this line. 

Esc-T Transpose two words. 

Esc-U Change word after point to all uppercase letters. 


Esc-Ctrl-E Perform shell alias, history, and word expansion on the line. 
Esc-Ctrl-H Delete one word backward. 


Esc-Ctrl-Y Insert the first argument to the previous command (usually the second 
word) at point. 


Esc-Delete Delete one word backward. 


Esc-^ Perform history expansion on the line. 

Esc-< Move to first line of history file. 

Esc-> Move to last line of history file. 

Esc-. Insert last word in previous command line after point. 
Esc-_ Same as above. 

Tab Attempt filename completion on current word. 

Esc-? List the possible completions of the text before point. 
Esc-/ Attempt filename completion on current word. 

Esc-~ Attempt username completion on current word. 

Esc-$ Attempt variable completion on current word. 

Esc-@ Attempt hostname completion on current word. 

Esc-! Attempt command name completion on current word. 

Esc-Tab Attempt completion from text in the command history. 

Esc-& Attempt tilde expansion on the current word. 

Esc-\ Delete all the spaces and tabs around point. 

Esc-* Insert all of the completions that would be generated by Esc-= before 
point. 

Esc-= List the possible completions before point. 

Esc-{ Attempt filename completion and return the list to the shell enclosed 
within braces. 


vi Control Mode Commands 


The material in this section also appears in Learning the bash Shell, 3rd 
Edition, by Cameron Newham (O’Reilly). 


Table A-24 shows a complete list of readline vi control mode commands. 


Table A-24. vi mode commands 


Command Meaning 


h Move left one character. 
l Move right one character. 
w Move right one word. 
b Move left one word. 
W Move to beginning of next nonblank word. 
B Move to beginning of preceding nonblank word. 
e Move to end of current word. 
E Move to end of current nonblank word. 
0 Move to beginning of line. 
. Repeat the last a insertion. 
^ Move to first nonblank character in line. 
$ Move to end of line. 
i Insert text before current character. 
a Insert text after current character. 
I Insert text at beginning of line. 
A Insert text at end of line. 
R Overwrite existing text. 
dh Delete one character backward. 
dl Delete one character forward. 
db Delete one word backward. 
dw Delete one word forward. 
dB Delete one nonblank word backward. 
dW Delete one nonblank word forward. 
d$ Delete to end of line. 
d0 Delete to beginning of line. 
D Equivalent to d$ (delete to end of line). 
dd Equivalent to 0d$ (delete entire line). 
C Equivalent to c$ (delete to end of line, enter input mode). 
cc Equivalent to 0c$ (delete entire line, enter input mode). 
x Equivalent to dl (delete one character forward). 
X Equivalent to dh (delete one character backward). 
k or - Move backward one line. 
j or + Move forward one line. 
G Move to line given by repeat count. 
/string Search forward for string. 
?string Search backward for string. 
n Repeat search forward. 
N Repeat search backward. 
fx Move right to next occurrence of x. 
Fx Move left to previous occurrence of x. 
tx Move right to next occurrence of x, then back one space. 
Tx Move left to previous occurrence of x, then forward one space. 
; Redo last character finding command. 
, Redo last character finding command in opposite direction. 
\ Do filename completion. 
* Do wildcard expansion (onto command line). 
= Do wildcard expansion (as printed list). 
~ Invert (twiddle) case of current character(s). 
_ Append last word of previous command, enter input mode. 
Ctrl-L Start a new line and redraw the current line on it. 
# Prepend # (comment character) to the line and send it to history. 

Table of ASCII Values 


Many of our favorite computer books have an ASCII chart (Table A-25). 
Even in the era of GUIs and web servers, you may be surprised to find that 
you still need to look up a character every now and then. It’s certainly useful 
when working with tr or finding some special sequence of escape characters. 
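

If you just need one or two values, the shell can often do the lookup for you. The following is a minimal sketch; the function names char_codes, octal_to_char, and tabs_to_spaces are ours, invented for this example. It relies on printf treating an argument with a leading quote as the numeric value of the next character, and on tr accepting octal escapes.

```bash
# Hypothetical helper: print the Int, Octal, and Hex columns for one character.
# A leading quote in a numeric printf argument means "value of the next character".
char_codes() {
    printf '%d %03o %02x\n' "'$1" "'$1" "'$1"
}

# Hypothetical helper: go the other way, from an octal value to the character.
# The octal digits are spliced into the printf format as a \NNN escape.
octal_to_char() {
    printf "\\$1\n"
}

# Hypothetical helper: use an octal escape with tr to turn tabs (011) into spaces.
tabs_to_spaces() {
    tr '\011' ' '
}

char_codes A          # 65 101 41
octal_to_char 101     # A
```

This is handy for spot checks, but a printed chart is still quicker when you want to scan for a control character or compare several values at once.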


Table A-25. ASCII values 


Int Octal Hex ASCII 
0 000 00 ^@ 
1 001 01 ^A 
2 002 02 ^B 
3 003 03 ^C 
4 004 04 ^D 
5 005 05 ^E 
6 006 06 ^F 
7 007 07 ^G 
8 010 08 ^H 
9 011 09 ^I 
10 012 0a ^J 
11 013 0b ^K 
12 014 0c ^L 
13 015 0d ^M 
14 016 0e ^N 
15 017 0f ^O 
16 020 10 ^P 
17 021 11 ^Q 
18 022 12 ^R 
19 023 13 ^S 
20 024 14 ^T 
21 025 15 ^U 
22 026 16 ^V 
23 027 17 ^W 
24 030 18 ^X 
25 031 19 ^Y 
26 032 1a ^Z 
27 033 1b ^[ 
28 034 1c ^\ 
29 035 1d ^] 
30 036 1e ^^ 
31 037 1f ^_ 
32 040 20 (space) 
33 041 21 ! 
34 042 22 " 
35 043 23 # 
36 044 24 $ 
37 045 25 % 
38 046 26 & 
39 047 27 ' 
40 050 28 ( 
41 051 29 ) 
42 052 2a * 
43 053 2b + 
44 054 2c , 
45 055 2d - 
46 056 2e . 
47 057 2f / 
48 060 30 0 
49 061 31 1 
50 062 32 2 
51 063 33 3 
52 064 34 4 
53 065 35 5 
54 066 36 6 
55 067 37 7 
56 070 38 8 
57 071 39 9 
58 072 3a : 
59 073 3b ; 
60 074 3c < 
61 075 3d = 
62 076 3e > 
63 077 3f ? 
64 100 40 @ 
65 101 41 A 
66 102 42 B 
67 103 43 C 
68 104 44 D 
69 105 45 E 
70 106 46 F 
71 107 47 G 
72 110 48 H 
73 111 49 I 
74 112 4a J 
75 113 4b K 
76 114 4c L 
77 115 4d M 
78 116 4e N 
79 117 4f O 
80 120 50 P 
81 121 51 Q 
82 122 52 R 
83 123 53 S 
84 124 54 T 
85 125 55 U 
86 126 56 V 
87 127 57 W 
88 130 58 X 
89 131 59 Y 
90 132 5a Z 
91 133 5b [ 
92 134 5c \ 
93 135 5d ] 
94 136 5e ^ 
95 137 5f _ 
96 140 60 ` 
97 141 61 a 
98 142 62 b 
99 143 63 c 
100 144 64 d 
101 145 65 e 
102 146 66 f 
103 147 67 g 
104 150 68 h 
105 151 69 i 
106 152 6a j 
107 153 6b k 
108 154 6c l 
109 155 6d m 
110 156 6e n 
111 157 6f o 
112 160 70 p 
113 161 71 q 
114 162 72 r 
115 163 73 s 
116 164 74 t 
117 165 75 u 
118 166 76 v 
119 167 77 w 
120 170 78 x 
121 171 79 y 
122 172 7a z 
123 173 7b { 
124 174 7c | 
125 175 7d } 
126 176 7e ~ 
127 177 7f ^? 


Appendix B. Examples Included with bash 


The bash tarball includes a lot of material that is well worth exploring (after 
you’ve finished reading this book, of course). It includes sample code and 
examples, scripts, functions, and startup files. The easiest way to access this 
material is via our up-to-date list with hyperlinks, but we’re including this 
appendix to provide a taste of what you’ll find, since few people actually 
access the tarball or build from source anymore. 


bash Documentation and Examples 


The startup-files directory provides many examples of what you can put in 
your own startup files. In particular, bash_aliases has many useful aliases. 
Bear in mind that if you copy these files wholesale, you’ll have to edit them 
for your system because many of the paths will be different. Refer to 
Chapter 16 for further information on changing these files to suit your needs. 


The functions directory contains many function definitions that you might 
find useful. Among them are: 


basename 


The basename utility, missing from some systems. 


dirfuncs 


Directory manipulation facilities. 


dirname 


The dirname utility, missing from some systems. 


whatis 


An implementation of the Tenth Edition Bourne shell whatis builtin. 


whence 


An almost exact clone of the Korn shell whence builtin. 


If you come from a Korn shell background, you may find kshenv especially 
helpful. This contains function definitions for some common Korn facilities 
such as whence, print, and the two-parameter cd builtins. 
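

To give a feel for what such function definitions look like, here is a minimal sketch of basename-like and dirname-like functions built on parameter expansion. This is not the code shipped in the bash tarball; the names my_basename and my_dirname are hypothetical, and the sketch deliberately ignores corner cases such as a bare / or several trailing slashes.

```bash
# Hypothetical sketch, not the functions shipped with bash.
# Print the last component of a path, ignoring one trailing slash.
my_basename() {
    local p=${1%/}              # drop a single trailing slash, if any
    printf '%s\n' "${p##*/}"    # strip everything up to the last slash
}

# Print everything before the last component of a path.
my_dirname() {
    local p=${1%/}
    case $p in
        */*) p=${p%/*}                  # strip the last component
             printf '%s\n' "${p:-/}" ;; # nothing left means the root directory
        *)   printf '.\n' ;;            # no slash at all: current directory
    esac
}

my_basename /usr/local/bin/bash    # bash
my_dirname /usr/local/bin/bash     # /usr/local/bin
```

Because these are shell functions, they run without starting an external process, which is one reason such replacements can be attractive inside loops.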


The scripts directory contains many examples of bash scripts. The two 
largest scripts are examples of the complex things you can do with shell 
scripts. The first is a (rather amusing) adventure game interpreter and the 
second is a C shell interpreter. The other scripts include examples of 
precedence rules, a scrolling text display, a “spinning wheel” progress 
display, and how to prompt the user for a particular type of answer. 
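

As a taste of the last item, the following sketch keeps prompting until it gets an answer of a particular type. It is our own illustration rather than one of the bundled scripts, and the function name ask_yes_no and its return codes are assumptions made for the example.

```bash
# Hypothetical sketch, not one of the scripts shipped with bash.
# Returns 0 for yes, 1 for no, and 2 if input ends before a valid answer.
ask_yes_no() {
    local prompt=$1 answer
    while true; do
        printf '%s [y/n] ' "$prompt" >&2      # prompt on stderr, not stdout
        read -r answer || return 2            # end of input: give up
        case $answer in
            [Yy]|[Yy][Ee][Ss]) return 0 ;;
            [Nn]|[Nn][Oo])     return 1 ;;
            *) printf 'Please answer y or n.\n' >&2 ;;
        esac
    done
}
```

A typical use would be something like: if ask_yes_no "Remove the temporary files?"; then ...; fi. Writing the prompt to standard error keeps it out of any pipeline that captures the script's output.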


Not only are the script and function examples useful for including in your 
environment, but they also provide many alternative examples that you can 
learn from in addition to reading this book. We encourage you to experiment 
with them. 


Table B-1 is an index of what you will find as of bash 4.2. 


NOTE 


At the request of the Free Software Foundation, Chet removed some examples 
from recent versions of bash because there is some question about the 
provenance of the code. If an example you are interested in is missing, look for 
it in older releases or check http://www.bashcookbook.com/bashinfo/. 


Table B-1. bash 4.2 documentation and examples 


Path Description 

./bash/ABOUT-NLS Notes on the Free Translation Project 
./bash/AUTHORS Master author manifest for bash 
./bash/CHANGES DETAILED changes between versions 
./bash/COMPAT Incompatibilities between versions of bash 
./bash/COPYING GNU General Public License (various versions) 
./bash/INSTALL Basic installation instructions 
./bash/MANIFEST Master distribution manifest for bash 
./bash/NEWS A terse description of the new features added to bash 
./bash/NOTES Platform-specific configuration and operation notes 
./bash/POSIX Bash POSIX mode 
./bash/RBASH The restricted shell 
./bash/README bash high-level README 
./bash/Y2K Y2K notice 
./bash/FAQ The Bash FAQ 
./bash/INTRO A short introduction to bash 
./bash/Makefile Makefile for the Bash/documentation directory 
./bash/README bash documentation README 
./bash/aosa-bash Chapter 3: The Bourne-Again Shell from The Architecture of Open Source Applications, edited to trim length 
./bash/aosa-bash-full Chapter 3: The Bourne-Again Shell from The Architecture of Open Source Applications 
./bash/article An article Chet wrote about bash for The Linux Journal 
./bash/bash bash manpage 
./bash/bashbug bashbug manpage 
./bash/bashref The Bash Reference Manual 
./bash/bashref_toc Old Bash Reference Manual table of contents 
./bash/builtins builtins manpage, extracted from bash.1 
./bash/fdl GNU Free Documentation License 
./bash/rbash bash restricted shell manpage 
./bash/readline GNU readline manpage 
./bash/rose94 Article: “Bash, the Bourne-Again Shell” 
./bash/version bash version info 
./INDEX/INDEX An index of bash examples (a subset of this) 
./complete/bash_completion Programmable completion functions 
./complete/bashcc-1.0.1.tar ClearCase completions from Richard Smith 
./complete/cdfunc An example completion function for cd 
./complete/complete Various completion files 
./complete/complete-examples Completion examples 
./complete/complete2 Various completion files from Ian Macdonald 
./functions/array-stuff Various array functions (ashift, array_sort, reverse) 
./functions/array-to-string Converts an array to a string 
./functions/autoload An almost ksh-compatible autoload 
./functions/basename A replacement for basename(1) 
./functions/basename2 Fast basename(1) and dirname(1) functions for bash/sh 
./functions/coproc Start, control, and end coprocesses 
./functions/coshell Control shell coprocesses (see coprocess.bash) 
./functions/csh-compat A C shell compatibility package 
./functions/dirfuncs Directory manipulation functions from the book The Korn Shell 
./functions/dirname A replacement for dirname(1) 
./functions/dirstack Another implementation of the directory manipulation functions from the book The New KornShell Command and Programming Language 
./functions/emptydir Finds out if a directory is empty 
./functions/exitstat Displays the exit status of processes 
./functions/external Like command but FORCES use of external command 
./functions/fact A recursive factorial function 
./functions/fstty Frontend to sync TERM changes to both stty(1) and readline bind 
./functions/func Prints out definitions for functions named by arguments 
./functions/gethtml Gets a web page from a remote server (wget(1) in bash!) 
./functions/getoptx getopt function that parses long-named options 
./functions/inetaddr Performs internet address conversion (inet2hex & hex2inet) 
./functions/inpath Returns zero if the argument is in the path and executable 
./functions/isnum Tests user input on numeric or character values 
./functions/isnum2 Tests user input on numeric values, with floating point 
./functions/isvalidip Tests user input for valid IP addresses 
./functions/jdate A function for Julian date conversion 
./functions/jj Looks for running jobs 
./functions/keep Tries to keep some programs in the foreground and running 
./functions/ksh-cd ksh-like cd: cd [-LP] [dir [change]] 
./functions/ksh-compat-test ksh-like arithmetic test replacements 
./functions/kshenv Functions and aliases to provide the beginnings of a ksh environment for bash 
./functions/login Replaces the login and newgrp builtins in old Bourne shells 
./functions/lowercase Renames files to lowercase 
./functions/manpage Finds and prints a manual page 
./functions/mhfold Prints MH folders; useful only because folders(1) doesn’t print mod date/times 
./functions/notify Notifies when jobs change status 
./functions/pathfuncs Path-related functions (no_path, add_path, pre-path, del_path) 
./functions/recurse A recursive directory traverser 
./functions/repeat2 A clone of C shell builtin repeat 
./functions/repeat3 A clone of C shell builtin repeat 
./functions/seq Generates a sequence from m to n; m defaults to 1 
./functions/seq2 Generates a sequence from m to n; m defaults to 1 
./functions/shcat A readline-based pager 
./functions/shcat2 A readline-based pager 
./functions/sort-pos-params Sorts the positional parameters 
./functions/substr A function to emulate the ancient ksh builtin 
./functions/substr2 A function to emulate the ancient ksh builtin 
./functions/term A shell function to set the terminal type interactively or not 
./functions/whatis An implementation of the 10th Edition Unix sh builtin whatis(1) command 
./functions/whence An almost-ksh-compatible whence(1) command 
./functions/which An emulation of which(1) as it appears in FreeBSD 
./functions/xalias Converts csh alias commands to bash functions 
./functions/xfind A find(1) clone 
./loadables/Makefile Simple Makefile for the sample loadable builtins 
./loadables/Makefile.inc Sample Makefile for bash loadable builtin development 
./loadables/README README 
./loadables/basename Returns the non-directory portion of a pathname 
./loadables/cat cat(1) replacement with no options, the way cat was intended 
./loadables/cut cut(1) replacement 
./loadables/dirname Returns the directory portion of a pathname 
./loadables/finfo Prints file info 
./loadables/getconf POSIX.2 getconf utility 
./loadables/head Copies the first part of a file 
./loadables/hello Obligatory “Hello World” / sample loadable 
./loadables/id POSIX.2 user identity 
./loadables/ln Makes links 
./loadables/loadables Includes files needed by all loadable builtins 
./loadables/logname Prints login name of current user 
./loadables/mkdir Makes directories 
./loadables/mypid Adds $MYPID as a shell builtin 
./loadables/necho echo without options or argument interpretation 
./loadables/pathchk Checks pathnames for validity and portability 
./loadables/print Loadable ksh-93-style print builtin 
./loadables/printenv Minimal built-in clone of BSD printenv(1) 
./loadables/printf Old printf 
./loadables/push Anyone remember TOPS-20? 
./loadables/pushd Old pushd 
./loadables/realpath Canonicalizes pathnames, resolving symlinks 
./loadables/rmdir Removes directories 
./loadables/setpgid bash loadable wrapper for setpgid system call 
./loadables/sleep Sleeps for fractions of a second 
./loadables/sprintf Old sprintf 
./loadables/strftime Loadable built-in interface to strftime(3) 
./loadables/sync Syncs the disks by forcing pending filesystem writes to complete 
./loadables/tee Duplicates standard input 
./loadables/template Example template for loadable builtin 
./loadables/truefalse True and false builtins 
./loadables/tty Returns the terminal name 
./loadables/uname Prints system information 
./loadables/unlink Removes a directory entry 
./loadables/whoami Prints out username of current user 
./loadables/perl/Makefile Makefile for built-in Perl interpreter 
./loadables/perl/README Illustrates how to build a Perl interpreter into bash 
./loadables/perl/bperl perl builtin 
./loadables/perl/iperl The Perl interpreter 
./misc/aliasconv Converts csh aliases to bash aliases and functions 
./misc/cshtobash Converts csh aliases, environment variables, and variables to bash equivalents 
./misc/suncmd SunView TERMCAP string 
./obashdb/PERMISSION Permission to use and distribute 
./obashdb/README Deprecated sample implementation of a bash debugger; see http://bashdb.sourceforge.net/ instead 
./obashdb/bashdb Deprecated bashdb (bash shell debugger); see http://bashdb.sourceforge.net/ instead 
./scripts/adventure Text adventure game in bash! 
./scripts/bash-hexdump hexdump(1) in bash 
./scripts/bcsh Bourne shell csh emulator 
./scripts/cat readline-based pager 
./scripts/center Centers a group of lines 
./scripts/dd-ex Line editor using only /bin/sh, /bin/dd, and /bin/rm 
./scripts/fixfiles Recurses a tree and fixes files containing various “bad” characters 
./scripts/hanoi The inevitable Towers of Hanoi in bash 
./scripts/inpath Searches $PATH for a file with the same name as $1; returns TRUE if found 
./scripts/krand Produces a random number within integer limits 
./scripts/line-input Line input routine for GNU Bourne Again shell plus terminal-control primitives 
./scripts/nohup bash version of nohup command 
./scripts/precedence Tests relative precedences for && and || operators 
./scripts/randomcard Prints a random card from a card deck 
./scripts/scrollbar Displays scrolling text 
./scripts/scrollbar2 Displays scrolling text 
./scripts/self-repro A self-reproducing script (careful!) 
./scripts/showperm Converts ls(1) symbolic permissions into octal mode 
./scripts/shprompt Displays a prompt and gets an answer satisfying certain criteria 
./scripts/spin Displays a spinning wheel to show progress 
./scripts/timeout Gives rsh(1) a shorter timeout 
./scripts/timeout2 Executes a given command with a timeout 
./scripts/timeout3 Executes a given command with a timeout 
./scripts/vtree2 Displays a tree printout of dir in 1k blocks 
./scripts/vtree3 Displays a graphical tree printout of a directory 
./scripts/vtree3a Displays a graphical tree printout of a directory 
./scripts/websrv A web server in bash! 
./scripts/xterm_title Prints the contents of the xterm title bar 
./scripts/zprintf Emulates printf (obsolete since it’s now a bash builtin) 
./scripts.noah/PERMISSION Permissions to use the scripts in this directory 
./scripts.noah/README README 
./scripts.noah/aref Pseudoarrays and substring indexing examples 
./scripts.noah/bash.sub Library functions used by require.bash 
./scripts.noah/bash_version A function to slice up $BASH_VERSION 
./scripts.noah/meta Enables and disables eight-bit readline input 
./scripts.noah/mktmp Makes a temporary file with a unique name 
./scripts.noah/number A fun hack to translate numerals into English 
./scripts.noah/prompt A way to set $PS1 to some predefined strings 
./scripts.noah/remap_keys A frontend to bind to redo readline bindings 
./scripts.noah/require Lisp-like require/provide library functions for bash 
./scripts.noah/send_mail Replacement SMTP client written in bash 
./scripts.noah/shcat bash replacement for cat(1) 
./scripts.noah/source Replacement for source that uses current directory 
./scripts.noah/string The string(3) functions at the shell level 
./scripts.noah/stty Frontend to stty(1) that changes readline bindings too 
./scripts.noah/y_or_n_p Prompts for a yes/no/quit answer 
./scripts.v2/PERMISSION Permissions to use the scripts in this directory 
./scripts.v2/README README 
./scripts.v2/arc2tarz Converts an “arc” archive to a compressed tar archive 
./scripts.v2/bashrand Random number generator with upper and lower bounds and optional seed 
./scripts.v2/cal2day Converts a day number to a name 
./scripts.v2/cdhist cd replacement with a directory stack added 
./scripts.v2/corename Tells what produced a core file 
./scripts.v2/fman Fast man(1) replacement 
./scripts.v2/frcp Copies files using ftp(1) but with rcp-type command-line syntax 
./scripts.v2/lowercase Changes filenames to lowercase 
./scripts.v2/ncp A nicer frontend for cp(1) (has -i, etc.) 
./scripts.v2/newext Changes the extension of a group of files 
./scripts.v2/nmv A nicer frontend for mv(1) (has -i, etc.) 
./scripts.v2/pages Prints specified pages from files 
./scripts.v2/pf A pager frontend that handles compressed files 
./scripts.v2/pmtop Poor man’s top(1) for SunOS 4.x and BSD/OS 
./scripts.v2/ren Renames files by changing parts of filenames that match a pattern 
./scripts.v2/rename Changes the names of files that match a pattern 
./scripts.v2/repeat Executes a command multiple times 
./scripts.v2/shprof Line profiler for bash scripts 
./scripts.v2/untar Unarchives a (possibly compressed) tar archive into a directory 
./scripts.v2/uudec Carefully uudecodes multiple files 
./scripts.v2/uuenc uuencodes multiple files 
./scripts.v2/vtree Prints a visual display of a directory tree 
./scripts.v2/where Shows where commands that match a pattern are 
./startup-files/Bash_aliases Some useful aliases (Fox) 
./startup-files/Bash_profile Sample startup file for bash login shells (Fox) 
./startup-files/Bashrc Sample Bourne Again shell init file (Fox) 
./startup-files/README README 
./startup-files/bash-profile Sample startup file for bash login shells (Ramey) 
./startup-files/bashrc Sample Bourne Again shell init file (Ramey) 
./startup-files/apple/README README 
./startup-files/apple/aliases Sample aliases for macOS 
./startup-files/apple/bash Sample user preferences file 
./startup-files/apple/environment Sample Bourne Again shell environment file 
./startup-files/apple/login Sample login wrapper 
./startup-files/apple/logout Sample logout wrapper 
./startup-files/apple/rc Sample Bourne Again shell config file 
./readline/CHANGELOG readline-specific changelog 
./readline/CHANGES DETAILED changes between versions 
./readline/COPYING GNU General Public License (various versions) 
./readline/INSTALL Basic installation instructions 
./readline/MANIFEST Master distribution manifest for readline 
./readline/NEWS A terse description of the new features added to readline 
./readline/README bash high-level README 
./readline/USAGE A note on legal use of readline through a shared-library linking mechanism 
./readline/Makefile Makefile for the readline library documentation 
./readline/fdl GNU Free Documentation License 
./readline/hist readline history (seems to be only RL4.3) 
./readline/history GNU History library manpage 
./readline/history_3 GNU History library manpage 
./readline/history_toc Old GNU History library 
./readline/hstech User interface to the GNU History library documentation 
./readline/hsuser User interface to the GNU History library documentation 
./readline/manvers Manuscript version (seems to be only RL4.3) 
./readline/readline GNU readline manpage 
./readline/readline_3 readline docs 
./readline/readline_toc Old GNU readline library table of contents 
./readline/rlman The GNU readline library API 
./readline/rltech Programming with GNU readline 
./readline/rluser Command-line editing 
./readline/rluserman GNU readline library user manual 
./readline/version bash version info 


Appendix C. Command-Line Processing 


Throughout the book we’ve seen a variety of ways in which the shell 
processes input lines, especially using read. We can think of this process as a 
subset of the things the shell does when processing command lines. This 
appendix provides a more detailed description of the steps involved in 
processing the command line and how you can get bash to make a second 
pass with eval. The material in this appendix also appears in Learning the 
bash Shell, 3rd Edition, by Cameron Newham (O’Reilly). 


Command-Line Processing Steps 


We’ve touched upon command-line processing throughout this book; we’ve
mentioned how bash deals with single quotes (''), double quotes (""), and
backslashes (\); how it separates characters on a line into words, even
allowing you to specify the delimiter it uses via the environment variable
IFS; how it assigns the words to shell variables (e.g., $1, $2, etc.); and how
it can redirect input and output to/from files or other processes (pipelines). In
order to be a real expert at shell scripting (or to debug some gnarly
problems), you’ll need to understand the various steps involved in
command-line processing, especially the order in which they occur.


Each line that the shell reads from STDIN or from a script is called a pipeline 
because it contains one or more commands separated by zero or more pipe 
characters (|). Figure C-1 shows the steps in command-line processing. 
Figure C-1. Steps in command-line processing


For each pipeline it reads, the shell breaks it up into commands, sets up the 
I/O for the pipeline, then does the following for each command: 


1. Splits the command into tokens that are separated by the fixed set of
metacharacters: space, tab, newline, ;, (, ), <, >, |, and &. Types of
tokens include words, keywords, I/O redirectors, and semicolons.


2. Checks the first token of each command to see if it is a keyword with
no quotes or backslashes. If it’s an opening keyword such as if or
another control-structure opener, function, {, or (, then the command
is actually a compound command. The shell sets things up internally for
the compound command, reads the next command, and starts the
process again. If the keyword isn’t a compound command opener (e.g.,
it is a control-structure “middle” like then, else, or do; an “end” like
fi or done; or a logical operator), the shell signals a syntax error.


3. Checks the first word of each command against the list of aliases. If a
match is found, it substitutes the alias’s definition and goes back to step
1; otherwise, it goes on to step 4. This scheme allows recursive aliases
and allows for keywords to be defined (e.g., alias aslongas=while or
alias procedure=function).


4. Performs brace expansion. For example, a{b,c} becomes ab ac.


5. Substitutes the user’s home directory ($HOME) for tilde if it is at the
beginning of a word. Substitutes the user’s home directory for ~user.


6. Performs parameter (variable) substitution for any expression that starts
with a dollar sign ($).


7. Does command substitution for any expression of the form $(string).


8. Evaluates arithmetic expressions of the form $((string)).


9. Takes the parts of the line that resulted from parameter, command, and
arithmetic substitution and splits them into words again. This time it
uses the characters in $IFS as delimiters instead of the set of
metacharacters in step 1.


10. Performs pathname expansion, a.k.a. wildcard expansion, for any
occurrences of *, ?, and [ ] pairs.


11. Uses the first word as a command by looking up its source in the
following order: as a function command, then as a builtin, then as a file
in any of the directories in $PATH.


12. Runs the command after setting up I/O redirection and other such
things.
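
The difference between the tokenizing in step 1 and the word splitting in step 9 is easy to lose in a list this long. The following is a minimal sketch, not part of the original text; the variable names and sample strings are made up for illustration. It shows that the result of an expansion is split on the characters in IFS, and that quote characters produced by an expansion are not treated as quotes.

```shell
# Hypothetical sample data; the variable names are illustrative only.
line='one:two three'
quoted='"a b"'

old_ifs=$IFS
IFS=:
# Step 9: the expansion of $line is split on the characters in IFS (here ":"),
# not on the step 1 metacharacters, so the space does not separate words.
set -- $line
split_count=$#
second_word=$2
IFS=$old_ifs

# Quote characters that come out of an expansion are ordinary characters:
# quoting was dealt with before step 6, so this splits on whitespace.
set -- $quoted
quote_count=$#

printf 'split_count=%s second_word=%s quote_count=%s\n' \
    "$split_count" "$second_word" "$quote_count"
# prints: split_count=2 second_word=two three quote_count=2
```

Saving and restoring IFS as shown keeps the change from leaking into the rest of a script.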


That’s a lot of steps—and it’s not even the whole story! But before we go on, 
an example should make this process clearer. Assume that the following 
command has been run: 


alias ll="ls -l" 


Further assume that a file exists called .hist537 in user alice’s home 
directory, which is /home/alice, and that there is a double-dollar-sign variable 
$$ whose value is 2537 (remember $$ is the process ID, a number unique 
among all currently running processes). 


Now let’s see how the shell processes the following command: 
ll $(type -path cc) ~alice/.*$(($$%1000))


Here is what happens to this line: 


1. ll $(type -path cc) ~alice/.*$(($$%1000)) splits the input into
words.


2. ll is not a keyword, so step 2 does nothing.


3. ls -l $(type -path cc) ~alice/.*$(($$%1000)) substitutes ls -l
for its alias ll. The shell then repeats steps 1 through 3; step 1 splits
the ls -l into two words.


4. ls -l $(type -path cc) ~alice/.*$(($$%1000)) does nothing.


5. ls -l $(type -path cc) /home/alice/.*$(($$%1000)) expands
~alice into /home/alice.


6. ls -l $(type -path cc) /home/alice/.*$((2537%1000)) substitutes
2537 for $$.


7. ls -l /usr/bin/cc /home/alice/.*$((2537%1000)) does command
substitution on type -path cc.


8. ls -l /usr/bin/cc /home/alice/.*537 evaluates the arithmetic
expression 2537%1000.


9. ls -l /usr/bin/cc /home/alice/.*537 does nothing.


10. ls -l /usr/bin/cc /home/alice/.hist537 substitutes the filename
for the wildcard expression .*537.


11. The command ls is found in /usr/bin.


12. /usr/bin/ls is run with the option -l and the two arguments.
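
The arithmetic and wildcard steps of this walkthrough can be tried without a user named alice. This is a rough sketch under stated assumptions: a scratch directory created with mktemp stands in for /home/alice, and the variable pid stands in for $$ so that the result is predictable.

```shell
# Hypothetical stand-ins: a scratch directory plays the role of /home/alice,
# and pid holds the example PID instead of the real $$.
demo_home=$(mktemp -d)
touch "$demo_home/.hist537"
pid=2537

suffix=$((pid % 1000))    # step 8: arithmetic evaluation

# Step 10 (pathname expansion) runs after step 8, so the wildcard pattern
# the shell finally matches against the directory is .*537.
set -- "$demo_home"/.*$((pid % 1000))
match=$1

printf '%s\n' "$suffix" "${match##*/}"
# prints:
# 537
# .hist537

rm -rf "$demo_home"
```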


Although this list of steps is fairly straightforward, it is not the whole story. 
There are still five ways to modify this process: quoting; using command, 
builtin, or enable; and using the advanced command eval. 
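
As a small illustration of the lookup order in step 11 and of how command changes it, consider the sketch below. It is only an example: the function is deliberately named after the echo builtin to make the effect visible, which is not something you would normally do.

```shell
# A function deliberately named after a builtin (illustrative only).
echo() { printf 'function echo: %s\n' "$*"; }

via_function=$(echo hi)           # step 11 finds the function first
via_command=$(command echo hi)    # command skips the function lookup

unset -f echo                     # remove the function again

printf '%s\n' "$via_function" "$via_command"
# prints:
# function echo: hi
# hi
```

In bash, builtin echo hi would also bypass the function, but it only ever looks for builtins, while command also falls through to files in $PATH.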


Quoting 


You can think of quoting as a way of getting the shell to skip some of the 12 
steps described earlier. In particular: 


■ Single quotes ('') bypass everything from step 1 through step 10,
including aliasing. All characters inside a pair of single quotes are
untouched. You can’t have single quotes inside single quotes, even if you
precede them with backslashes.


■ Double quotes ("") bypass steps 1 through 5, plus steps 9 and 10. That is,
they ignore pipe characters, aliases, tilde substitution, wildcard expansion,
and splitting into words via delimiters (e.g., blanks) inside the double
quotes. Single quotes inside double quotes have no effect. But double
quotes do allow parameter substitution, command substitution, and
arithmetic expression evaluation. You can include a double quote inside a
double-quoted string by preceding it with a backslash (\). You must also
backslash-escape $, ` (the archaic command substitution delimiter), and \
itself.


Table C-1 has simple examples to show how these work; they assume the
statement person=hatter was run and user alice’s home directory is
/home/alice.


Table C-1. Examples of using single and double quotes

Expression     Value
$person        hatter
"$person"      hatter
\$person       $person
'$person'      $person
"'$person'"    'hatter'
~alice         /home/alice
"~alice"       ~alice
'~alice'       ~alice


If you are wondering whether to use single or double quotes in a particular 
shell programming situation, it is safest to use single quotes unless you 
specifically need parameter, command, or arithmetic substitution. 
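
The quoting rules above can be checked directly at a prompt. The sketch below is illustrative only: it uses a bare ~ rather than ~alice, on the assumption that a user named alice may not exist on your system, and the variable names other than person are made up.

```shell
person=hatter

unquoted=$person          # parameter substitution happens
double="$person"          # still substituted inside double quotes
single='$person'          # single quotes leave the text untouched
escaped=\$person          # a backslash protects the dollar sign
nested="'$person'"        # single quotes inside double quotes are literal
tilde_quoted="~"          # no tilde substitution inside double quotes

printf '%s\n' "$unquoted" "$double" "$single" "$escaped" "$nested" "$tilde_quoted"
# prints:
# hatter
# hatter
# $person
# $person
# 'hatter'
# ~
```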


eval 


We have seen that quoting lets you skip steps in command-line processing.
Then there’s the eval command, which lets you go through the process again.
Performing command-line processing twice may seem strange, but it’s
actually very powerful: it lets you write scripts that create command strings
on the fly and then pass them to the shell for execution. This means that you
can give scripts “intelligence” to modify their own behavior as they are
running.


The eval statement tells the shell to take eval’s arguments and run them 
through the command-line processing steps all over again. To help you 
understand the implications of eval, we’ll start with a trivial example and 
work our way up to a situation in which we’re constructing and running 
commands on the fly. 


eval ls passes the string “ls” to the shell to execute; the shell prints a list of
files in the current directory. This is very simple; there is nothing about the
string “ls” that needs to be sent through the command-processing steps twice.
But consider this:


listpage="ls | more"
$listpage


Instead of producing a paginated file listing, the shell will treat | and more as
arguments to ls, and ls will complain that no files of those names exist. Why?
Because the pipe character appears as a pipe in step 6 when the shell
evaluates the variable, which is after it has actually looked for pipe
characters. The variable’s expansion isn’t even parsed until step 9. As a
result, ls will try to find files called | and more in the current directory!


Now consider eval $listpage instead of just $listpage. When the shell
gets to the last step, it will run the command eval with arguments ls, |, and
more. This causes the shell to go back to step 1 with a line that consists of
these arguments. It finds | in step 2 and splits the line into two commands, ls
and more. Each command is processed in the normal (and in both cases
trivial) way. The result is a paginated list of the files in your current directory.
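
The same effect can be seen without a pager. The sketch below is a stand-in for the ls | more example, not the original: the variable upcase and its contents are made up, and the eval argument is quoted, which is a common precaution rather than something the text above requires.

```shell
# A harmless stand-in for the ls | more example (names are illustrative).
upcase="echo one two | tr a-z A-Z"

plain=$($upcase)            # |, tr, a-z, and A-Z become arguments to echo
evaled=$(eval "$upcase")    # eval reparses the string, so | is a real pipe

printf 'plain:  %s\nevaled: %s\n' "$plain" "$evaled"
# prints:
# plain:  one two | tr a-z A-Z
# evaled: ONE TWO
```

Because eval runs whatever text it is handed, it should only ever be given strings that the script itself controls.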


Now you may start to see how powerful eval can be. It is an advanced feature
that requires considerable programming cleverness to be used most
effectively. It even has a bit of the flavor of artificial intelligence, in that it
enables you to write programs that can “write” and execute other programs.
You probably won’t use eval for everyday shell programming, but it’s worth
taking the time to understand what it can do.
Appendix D. Revision Control


Revision control systems are a way not only to travel back in time, but to see 
what has changed at various points in your timeline. They are also called 
versioning or version control systems, which is actually a more technically 
accurate name. Such a system allows you to maintain a repository of files in a
project, and to keep track of changes to those files, as well as the reasons for 
those changes. Modern revision control systems allow more than one 
developer to work concurrently on the same project, or even the same file. 


Revision control systems are essential to modern software development 
efforts, but they are also useful in many other areas, such as writing 
documentation, tracking system configurations (e.g., /etc/), and even writing 
books. We kept this edition of this book under revision control using Git 
while writing it; we used Subversion for the first edition. 


Some of the useful features of revision control systems include: 


■ Making it very difficult to lose your work, especially when the repository
is properly backed up.


■ Facilitating change control practices, and encouraging documenting why a
change is being made.


■ Allowing people in multiple locations to work together on a project, and
to keep up with others’ changes, without losing data by saving on top of
each other or sending lots of unreadable emails.


■ Allowing one person to work from multiple locations over time without
losing work or stepping on changes made at other locations.


■ Allowing you to back out changes easily or to see exactly what has
changed between one revision and another (except binary files). If you
follow effective logging practices, they will even tell you why a change
was made.


Systems like CVS and Subversion also allow a form of keyword expansion
that lets you embed revision metadata in nonbinary files.


There are many different free and commercial revision control systems, and if 
you are reading this book you should be using one! If you already know one, 
just use that. If your company has a standard tool, use that one. If neither of 
those help you choose, then use Git, Bazaar, or Mercurial. Do not use 
Subversion, CVS, RCS, or any of the older systems unless you have no 
choice. We’ll briefly cover pros, cons, and basic usage for Git, Bazaar, 
Mercurial, and Subversion in this appendix, all of which either come with or 
are available for every major modern operating system. But before that we 
need to give a bit of background. 


First, all the modern revision control systems are distributed, while older 
ones like Subversion and CVS are centralized. This is a major and 
fundamental difference, with some significant implications. In the older 
centralized systems, there is a central server, as the name implies, often 
maintained and backed up by your IT department, which is good. To do most
useful things you need to connect to that server, which can be bad since that’s
much slower than local disk access and may not be feasible while traveling,
for example. Also, in those systems you can check out only part of the
repository (“repo”), and thus you often have one large repo for the entire
company, and you just check out and work on the parts you need. These
systems also do the keyword expansion we mentioned; we’ll show that in the
section on Subversion. Finally, to commit is also to publish, which may be
considered either a feature or a bug in such systems, but is probably at
least undesirable, if not quite a bug.


The distributed systems, on the other hand, do not have a central server, 
though often one copy is designated as the “source of truth” by convention. 
The repo you are working on is a complete copy, and it’s just as good as 
anyone else’s. That’s a major change from the ability to just check out part of 
a repo. These systems do not do any keyword expansion and a commit is not 
the same as a publish, which requires an additional push step and a network 
connection to the remote repo. But they’re local for all but push/pull 
operations and thus really fast. 
WARNING


With CVS or Subversion, you don’t have to think about backups. If the code is 
committed, it’s someplace else and probably backed up by IT (commit == 
publish). That is often not true with the modern distributed systems. They have 
local repos, so committed code stays local (commit != publish) until and unless 
you push (publish) it to a remote repo. If you never push you have only one 
local copy, so make sure you have good backups! A great tool for that is 
etckeeper, discussed later in this appendix. The repository is inside /etc/, so if 
you accidentally rm -rf /etc there goes the repo too. 


The second major point is that Git unquestionably won the war, and 
“everyone” uses it, everywhere. OK, not quite everyone, since if you are still 
reading this you probably don’t. But an awful lot of people do use it, and 
arguably a large part of the reason is GitHub. So why are we not jumping 
fully on that bandwagon? We’re glad you asked. 


If you are a full-time developer working on a large project, you’re using Git
already, and it’s awesome. But if you are a more casual user, say a sysadmin
with a collection of scripts, Git can be less awesome. It is less actively
user-hostile than it used to be, but it’s still very complicated to use, and we have
seen no good mental model for how it works. Far too often, you have to
really understand Git’s guts in order to use it for anything nontrivial, and
that’s just ugly. Git is also made out of razor blades and chainsaws:
blazingly fast, extremely powerful, but dangerous; you can hurt yourself with
it. Git history, for example, is very malleable, and it considers this a feature,
not a bug. It uses hashes and dates instead of human-readable revision
numbers, and though there are good reasons for this it can be quite
inconvenient. Finally, the Git “index” is different; none of the other common
tools have this, but it does allow for a really handy trick where you can make
stream-of-consciousness changes but later commit them in logical blocks
using git add -p or git commit -p. We think that it’s a very powerful tool
that’s not suitable for beginners or casual users. But...it’s everywhere and
used by everyone, and that’s also a powerful argument.


If you are interested in the history of revision control, see “Understanding
Version-Control Systems” by Eric Raymond for a lot of detail. To see an
amazing example and just a really cool thing, check out the Unix History
Repository.


If you are going to start using revision control just by yourself, go jump in. 
But if you are going to start using it in a team, you must first decide: 

■ Which system or product to use

■ The update, commit, tag, and branch policies

■ The location of the central (and well-backed-up!) repository, if applicable

■ The structure of the project or directories in the repository, if applicable


This appendix is enough to get you started individually, but it barely 
scratches the surface; see Version Control with Git, 2nd Edition, by Jon 
Loeliger and Matthew McCullough or Version Control with Subversion, 2nd 
Edition, by C. Michael Pilato, Ben Collins-Sussman, and Brian Fitzpatrick, 
both from O’Reilly, for more in-depth introductions to revision control and 
complete details on the respective systems. Both have excellent treatments of 
the general concepts, although the Subversion book covers repository 
structure in more detail due to its potentially multiproject nature. Both also 
cover revision control policy. If your company has change control or related 
policies, use them. If not, we recommend you commit and update early and 
often. If you are working as a team, we strongly recommend reading some of 
the books listed in this appendix and carefully planning out a strategy. It will 
save vast amounts of time in the long run. 


See Also 

■ “Understanding Version-Control Systems” by Eric Raymond

■ Unix History Repository

■ “A Visual Guide to Version Control” on BetterExplained

■ Backup & Recovery by W. Curtis Preston (O’Reilly)

■ reposurgeon, a tool for converting from one system to another


Git 
Git is the de facto leader in revision control and is probably used by more
projects and more people than all the other systems combined. But if you
choose to use it, be prepared to use Google. A lot.


Git was originally written by Linus Torvalds for the Linux kernel project 
after the vendor of the previous system changed the licensing, but he very 
quickly turned it over to others. The design is heavily influenced by 
Torvalds’s years of experience on that massive and globally distributed 
project, and it is written by hardcore programmers for hardcore programmers. 
It is extremely powerful and flexible, but often quite complicated. The 
learning curve is unquestionably worth it for dedicated developers, but more 
casual or intermittent users may struggle. 


Pros 

■ Extremely popular and used everywhere.

■ Extremely fast, powerful, and flexible.

■ git add -p and git commit -p account for how code is really written.

■ Has https://github.com/, https://gitlab.com/, etc.

■ History is very malleable.


Cons 
■ You can perform operations that can cause data loss!

■ Harder to understand and use for more than very basic tasks than other
tools.

■ Inconsistent and complex command-line use.

■ History is very malleable.

■ Uses hashes and dates instead of human-readable revision numbers.


Example


This example is not suitable for enterprise or multiuser access (see the “See
Also” section for links to more information). This is just the basics, but it will
get you started and you can ramp up from here if you need to.


If Git is not already installed, you should install it using the preferred 
package manager for your operating system. 


The git command (with no options), git help, and git help command all
give you helpful hints and reminders.


Configure Git on your machine (see ~/.gitconfig, and the .git/config that the
init command will create):


/home/jp$ git config --global user.name "JP Vossen" 
/home/jp$ git config --global user.email "jp@jpsdomain.org" 
/home/jp$ git config --global core.pager "less -R" 
/home/jp$ git config --global color.ui true 


NOTE 
If you do not set your name and email as shown here, you will probably get a
message complaining about that. The message should be pretty clear about what
to do.


You might also consider: 


/home/jp$ git config --global alias.co checkout 
/home/jp$ git config --global alias.br branch 
/home/jp$ git config --global alias.ci commit 
/home/jp$ git config --global alias.st status 
/home/jp$ git config --global alias.last 'log -1 HEAD' 


Create a new repository for personal use in a home directory: 


/home/jp$ git init myrepo 
Initialized empty Git repository in /home/jp/myrepo/.git/ 


Create a new script and commit it: 
/home/jp$ cd myrepo

/home/jp/myrepo$ cat << EOF > hello
> #!/bin/bash -
> echo 'Hello World!'
> EOF

/home/jp/myrepo$ chmod +x hello

/home/jp/myrepo$ git add hello

/home/jp/myrepo$ git commit -m 'Initial import of shell script'
[master (root-commit) 62cb49e] Initial import of shell script
 1 file changed, 2 insertions(+)
 create mode 100755 hello


WARNING 


git add is not the same as add in other tools! Once you add a file in the other 
tools, changes to that file are always committed. In Git, add means “add the 
changes I just made to the index.” So if you make a change, and add it, then 
make another change, that second change is not in the index and will not be 
committed unless you add the file again, or commit using -a. This sounds very
annoying, and it is for basic use, but it’s part of the whole “index” concept that 
makes some other neat things possible, as we’ll see. 
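
To make the warning concrete, here is a minimal sketch of the behavior it describes. It assumes Git is installed; the scratch repository, the file name notes, and the identity values are all made up for illustration and are not part of the example that follows in this appendix.

```shell
# Scratch repository in a temporary directory; names and identity are made up.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name 'Example User'
git config user.email 'user@example.com'

echo 'first change' > notes
git add notes                     # stages the file as it is right now
echo 'second change' >> notes     # made after the add, so not in the index

git commit -q -m 'Commit only what was staged'

committed=$(git show HEAD:notes)  # the content recorded by the commit
pending=$(git diff --name-only)   # files with changes that are still unstaged

printf 'committed: %s\npending: %s\n' "$committed" "$pending"
# prints:
# committed: first change
# pending: notes

cd / && rm -rf "$repo"
```

Only the first line made it into the commit; the second change is still sitting in the working directory, waiting for another git add or a git commit -a.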


NOTE 


If you do not use the -m message option an editor will pop up and you can 
create a commit log in that. Which editor will appear and how you change that 
will depend on your OS and distribution; consult the appropriate documentation 
if you wish to change the editor. 


Check the status of your sandbox: 


/home/jp/myrepo$ git status
On branch master
nothing to commit, working directory clean


Add a new script to revision control:


/home/jp/myrepo$ cat << 'EOF' > mcd
> #!/bin/bash -
> mkdir -p "$1"
> cd "$1"
> EOF

/home/jp/myrepo$ chmod +x mcd

/home/jp/myrepo$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

        mcd

nothing added to commit but untracked files present (use "git add" to track)

/home/jp/myrepo$ git add mcd

/home/jp/myrepo$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   mcd

/home/jp/myrepo$ git commit -m 'Added new script: mcd'
[master a2c254d] Added new script: mcd
 1 file changed, 3 insertions(+)
 create mode 100755 mcd


Make a change, then check the difference: 


/home/jp/myrepo$ vi hello

/home/jp/myrepo$ git diff
diff --git a/hello b/hello
index 353223d..f36eea4 100644
--- a/hello
+++ b/hello
@@ -1,2 +1,2 @@
 #!/bin/bash -
-echo 'Hello World!'
+echo 'Hello Mom!'

/home/jp/myrepo$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   hello

no changes added to commit (use "git add" and/or "git commit -a")

TIP 


If you get a bunch of garbage escape characters on the screen when you run git
diff, try setting git config --global core.pager "less -R".


Commit the change using -a and thus avoiding git add hello: 


/home/jp/myrepo$ git commit -a -m 'Fine tuning' 
[master e1f0b2f] Fine tuning 
1 file changed, 1 insertion(+), 1 deletion(-) 


See the history of the repository or just one file: 


/home/jp/myrepo$ git log
commit e1f0b2f8e5c489d8c9112014cf494773712786b0
Author: JP Vossen <jp@jpsdomain.org>
Date:   Sun Jul 3 22:56:38 2016 -0400

    Fine tuning

commit a2c254d61e95eb4719746f196b66019446061d51
Author: JP Vossen <jp@jpsdomain.org>
Date:   Sun Jul 3 22:52:36 2016 -0400

    Added new script: mcd

commit 62cb49ee962d929122051c421128fea95d571ebb
Author: JP Vossen <jp@drake.jpsdomain.org>
Date:   Sun Jul 3 22:44:15 2016 -0400

    Initial import of shell script


/home/jp/myrepo$ git log hello
commit e1f0b2f8e5c489d8c9112014cf494773712786b0
Author: JP Vossen <jp@jpsdomain.org>
Date:   Sun Jul 3 22:56:38 2016 -0400

    Fine tuning

commit 62cb49ee962d929122051c421128fea95d571ebb
Author: JP Vossen <jp@drake.jpsdomain.org>
Date:   Sun Jul 3 22:44:15 2016 -0400

    Initial import of shell script


Revert to the older version after all. There are other ways to do this, 
depending on what other changes you may have in your working directory, 
but this is simple if not intuitive: 


/home/jp/myrepo$ git checkout 62cb49ee962d929122051c421128fea95d571ebb hello

/home/jp/myrepo$ cat hello
#!/bin/bash -
echo 'Hello World!'

/home/jp/myrepo$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   hello

/home/jp/myrepo$ git diff


But wait! We made a change, and status sees it but diff does not. Why?
Because the checkout already did the equivalent of a git add, so the change
is staged or cached:


/home/jp/myrepo$ git diff --cached
diff --git a/hello b/hello
index f36eea4..353223d 100755
--- a/hello
+++ b/hello
@@ -1,2 +1,2 @@
 #!/bin/bash -
-echo 'Hello Mom!'
+echo 'Hello World!'


We warned you... 


See Also 

■ man git

■ git help

■ https://github.com/features

■ https://about.gitlab.com/

■ https://en.wikipedia.org/wiki/Git_(software)

■ https://git-scm.com/

■ http://xkcd.com/1597/

■ Pro Git, 2nd Edition, by Scott Chacon and Ben Straub (Apress)

■ Version Control with Git, 2nd Edition, by Jon Loeliger and Matthew
McCullough (O’Reilly)

■ “10 Things I Hate About Git” by Steve Bennett

■ “Aha! Moments When Learning Git” on BetterExplained

■ The EasyGit wrapper

■ Recipe 16.16, “Creating and Changing Into a New Directory in One Step”


Bazaar 


Bazaar was Canonical’s answer to Git, but it lost the war and is basically in
maintenance mode.


Pros 
■ Not Git.

■ Extremely user-friendly with awesome docs.

■ Cross-platform (Python) with several GUI tools: QBzr (Qt), Loggerhead
(web), and others.

■ Uses incrementing integer revision numbers.

■ History is immutable.

■ Has Launchpad.


Cons 

■ Not Git.

■ Lost the war and is not-quite-dead.

■ Not as fast as Git, but that almost never matters.

■ Not nearly as well known as Git.


Example 


This example is not suitable for enterprise or multiuser access (see the “See
Also” section for links to more information). This is just to show how easy
the basics are.


If Bazaar is not already installed, you should install it using the preferred
package manager for your operating system.


The bzr command (with no options), bzr help, and bzr help command all
give you helpful hints and reminders.


Create a new repository for personal use in a home directory: 


/home/jp$ bzr init myrepo 
Created a standalone tree (format: 2a) 


Create a new script and commit it: 


/home/jp$ cd myrepo

/home/jp/myrepo$ cat << EOF > hello
> #!/bin/bash -
> echo 'Hello World!'
> EOF

/home/jp/myrepo$ chmod +x hello

/home/jp/myrepo$ bzr add hello
adding hello

/home/jp/myrepo$ bzr commit -m 'Initial import of shell script'
Committing to: /home/jp/myrepo/
added hello
Committed revision 1.


NOTE 


If you do not use the -m message option an editor will pop up and you can 
create a commit log in that. Which editor will appear and how you change that 
will depend on your OS and distribution; consult the appropriate documentation 
if you wish to change the editor. 


Check the status of your sandbox: 
/home/jp/myrepo$ bzr status


Add a new script to revision control:


/home/jp/myrepo$ cat << 'EOF' > mcd
> #!/bin/bash -
> mkdir -p "$1"
> cd "$1"
> EOF


/home/jp/myrepo$ chmod +x mcd 


/home/jp/myrepo$ bzr status 
unknown: 
mcd 


/home/jp/myrepo$ bzr add mcd 
adding mcd 


/home/jp/myrepo$ bzr status 
added: 
mcd 


/home/jp/myrepo$ bzr commit -m 'Added new script: mcd' 
Committing to: /home/jp/myrepo/ 

added mcd 

Committed revision 2. 


Make a change, then check the difference: 


/home/jp/myrepo$ vi hello

/home/jp/myrepo$ bzr diff
=== modified file 'hello'
--- hello 2016-07-04 03:26:32 +0000
+++ hello 2016-07-04 03:28:11 +0000
@@ -1,2 +1,2 @@
 #!/bin/bash -
-echo 'Hello World!'
+echo 'Hello Mom!'

/home/jp/myrepo$ bzr status
modified:
  hello


Commit the change: 


/home/jp/myrepo$ bzr commit -m 'Fine tuning' 
Committing to: /home/jp/myrepo/ 

modified hello 

Committed revision 3. 


See the history of the repository or just one file: 


/home/jp/myrepo$ bzr log
revno: 3
committer: JP Vossen <jp@ringo.jpsdomain.org>
branch nick: myrepo
timestamp: Sun 2016-07-03 23:28:48 -0400
message:
  Fine tuning

revno: 2
committer: JP Vossen <jp@ringo.jpsdomain.org>
branch nick: myrepo
timestamp: Sun 2016-07-03 23:27:50 -0400
message:
  Added new script: mcd

revno: 1
committer: JP Vossen <jp@ringo.jpsdomain.org>
branch nick: myrepo
timestamp: Sun 2016-07-03 23:26:32 -0400
message:
  Initial import of shell script


/home/jp/myrepo$ bzr log hello
revno: 3
committer: JP Vossen <jp@ringo.jpsdomain.org>
branch nick: myrepo
timestamp: Sun 2016-07-03 23:28:48 -0400
message:
  Fine tuning

revno: 1
committer: JP Vossen <jp@ringo.jpsdomain.org>
branch nick: myrepo
timestamp: Sun 2016-07-03 23:26:32 -0400
message:
  Initial import of shell script


Revert to the older version after all: 


/home/jp/myrepo$ bzr revert -r1 hello 
M hello 


/home/jp/myrepo$ bzr status 
modified: 
hello 


/home/jp/myrepo$ bzr diff
=== modified file 'hello'
--- hello 2016-07-04 03:28:48 +0000
+++ hello 2016-07-04 03:29:44 +0000
@@ -1,2 +1,2 @@
 #!/bin/bash -
-echo 'Hello Mom!'
+echo 'Hello World!'


See Also 


■ man bzr

■ bzr help

■ https://en.wikipedia.org/wiki/Bazaar_(software)

■ http://wiki.bazaar.canonical.com/Documentation

■ http://wiki.bazaar.canonical.com/Workflows

■ Bazaar Version Control by Janos Gyerik (Packt)

■ Recipe 16.16, “Creating and Changing Into a New Directory in One Step”


Mercurial 


Mercurial was started at the same time as Git for the same reason, but never 
caught on quite as much. 


Pros 

■ Not Git.

■ Extremely user-friendly with good docs.

■ Cross-platform (Python) with several GUI tools.
  - Built-in web server (hg serve then http://localhost:8000/).

■ Uses incrementing integer revision numbers + a hex ID.
  - The hex ID is unique and consistent across all repo clones; the integer
    isn’t.

■ History is immutable.

■ Has Atlassian https://bitbucket.org/.


Cons 

■ Not Git.
■ Lost to Git but more active than Bazaar.
■ Not as well known as Git.
■ Not as fast as Git, but that almost never matters.


Example 


This example is not suitable for enterprise or multiuser access (see the “See
Also” section for links to more information). This is just to show how easy
the basics are.

If Mercurial is not already installed, you should install it using the preferred
package manager for your operating system.

Running hg with no options, hg help, and hg help command all give you
helpful hints and reminders.
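
Before working through the steps in this example, it can help to confirm that the hg command is actually on your path. The following is a minimal sketch and is not taken from the Mercurial documentation; the have_cmd helper name is our own invention, and it relies only on the shell's command -v builtin:

```shell
#!/usr/bin/env bash
# have_cmd is a hypothetical helper name; command -v is a standard shell builtin.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd hg; then
    hg --version | head -n 1    # the first line names the installed version
else
    echo 'hg not found; install Mercurial with your package manager' >&2
fi
```

The same helper works for any of the other tools in this appendix, such as bzr or svn.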


Create a new repository for personal use in a home directory:

/home/jp$ hg init myrepo

Create a new script and commit it:

/home/jp$ cd myrepo
/home/jp/myrepo$ cat << EOF > hello
> #!/bin/bash -
> echo 'Hello World!'
> EOF
/home/jp/myrepo$ chmod +x hello
/home/jp/myrepo$ hg add hello
/home/jp/myrepo$ hg commit -m 'Initial import of shell script'


NOTE

If you do not use the -m message option, an editor will pop up and you can
write the commit log there. Which editor appears and how you change it
depends on your OS and distribution; consult the appropriate documentation
if you wish to change the editor.

Check the status of your sandbox:


/home/jp/myrepo$ hg status 


Add a new script to revision control:

/home/jp/myrepo$ cat << 'EOF' > mcd
> #!/bin/bash -
> mkdir -p "$1"
> cd "$1"
> EOF
/home/jp/myrepo$ chmod +x mcd


/home/jp/myrepo$ hg status 
? mcd 


/home/jp/myrepo$ hg add mcd 


/home/jp/myrepo$ hg status 
A mcd 


/home/jp/myrepo$ hg commit -m 'Added new script: mcd'

Make a change, then check the difference:


/home/jp/myrepo$ vi hello 


/home/jp/myrepo$ hg diff
diff -r 663ba0ec20f5 hello
--- a/hello Sun Jul 03 23:38:54 2016 -0400
+++ b/hello Sun Jul 03 23:39:15 2016 -0400
@@ -1,2 +1,2 @@
 #!/bin/bash -
-echo 'Hello World!'
+echo 'Hello Mom!'


/home/jp/myrepo$ hg status 
M hello 


Commit the change: 
/home/jp/myrepo$ hg commit -m 'Fine tuning' 


See the history of the repository or just one file:

/home/jp/myrepo$ hg log
changeset:   2:c88ab0cbfcda
tag:         tip
user:        JP Vossen <jp@jpsdomain.org>
date:        Sun Jul 03 23:39:38 2016 -0400
summary:     Fine tuning

changeset:   1:663ba0ec20f5
user:        JP Vossen <jp@jpsdomain.org>
date:        Sun Jul 03 23:38:54 2016 -0400
summary:     Added new script: mcd

changeset:   0:38ab693c1c72
user:        JP Vossen <jp@jpsdomain.org>
date:        Sun Jul 03 23:38:03 2016 -0400
summary:     Initial import of shell script

/home/jp/myrepo$ hg log hello
changeset:   2:c88ab0cbfcda
tag:         tip
user:        JP Vossen <jp@jpsdomain.org>
date:        Sun Jul 03 23:39:38 2016 -0400
summary:     Fine tuning

changeset:   0:38ab693c1c72
user:        JP Vossen <jp@jpsdomain.org>
date:        Sun Jul 03 23:38:03 2016 -0400
summary:     Initial import of shell script


Revert to the older version after all: 


/home/jp/myrepo$ hg revert -r 1 hello 
/home/jp/myrepo$ cat hello 
#!/bin/bash - 

echo 'Hello World!' 


/home/jp/myrepo$ hg status 
M hello 


See Also

■ man hg
■ hg help
■ https://en.wikipedia.org/wiki/Mercurial
■ https://www.mercurial-scm.org/
■ https://www.mercurial-scm.org/guide
■ Book: http://hgbook.red-bean.com/
■ https://betterexplained.com/articles/intro-to-distributed-version-control-illustrated/
■ Recipe 16.16, “Creating and Changing Into a New Directory in One Step”
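
One caveat about the mcd script used in the examples: a script runs in its own process, so its cd cannot change the directory of the shell that called it. A function defined in the calling shell can. The following is a rough sketch of that idea, offered as an assumption about what such a function might look like and not as the code from Recipe 16.16:

```shell
#!/usr/bin/env bash
# mcd: make a directory (and any missing parents), then change into it.
# Defined as a function so that cd affects the current shell.
mcd() {
    mkdir -p -- "$1" && cd -- "$1"
}
```

Putting the function in a startup file such as ~/.bashrc would make it available in every interactive shell.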


Subversion 


According to the Subversion web site, “The goal of the Subversion project is 
to build a version control system that is a compelling replacement for CVS in 
the open source community.” Enough said. 


Pros 


■ Not Git.
■ Newer than CVS and RCS.
■ Simpler and arguably easier to understand and use than CVS (less
  historical baggage).
■ Atomic commits mean the commit either fails or succeeds as a whole,
  which makes it easy to track the state of an entire project as a single
  revision.
■ Easy to access remote repositories.
■ Allows easy renaming of files and directories while retaining history.
■ Easily handles binary files (no native diff support) and other objects such
  as symbolic links.
■ Central repository hacking is more officially supported, but less trivial.


Cons

■ Not Git.
■ Older technology; revision control has moved to the distributed model.
■ Can be complicated to build or install from scratch due to many
  dependencies. Use the version that came with your operating system if
  possible.


TIP 


SVN tracks revisions by repository, which means that each commit has its own 
internal SVN revision number. Thus consecutive commits by a single person 
may not have consecutive revision numbers since the global repository revision 
is incremented as other changes (possibly to other projects) are committed by 
other people. 


Example 


This example is not suitable for enterprise or multiuser access (see the “See 
Also” section for links to more information). This is just to show how easy 
the basics are. This example also has the EDITOR environment variable set to
nano (export EDITOR='nano --smooth --const --nowrap --suspend'), which some
people find more user-friendly than the default vi.


The svn help and svn help help commands are very useful. 


Create a new repository for personal use in a home directory: 


/home/jp$ svnadmin --fs-type=fsfs create /home/jp/svnroot 


Create a new project and import it: 


/home/jp$ cd /tmp
/tmp$ mkdir -p -m 0700 scripts/trunk scripts/tags scripts/branches
/tmp$ cd scripts/trunk
/tmp/scripts/trunk$ cat << EOF > hello
> #!/bin/sh
> echo 'Hello World!'
> EOF
/tmp/scripts/trunk$ cd ..
/tmp/scripts$ svn import /tmp/scripts file:///home/jp/svnroot/scripts


GNU nano 1.2.4          File: svn-commit.tmp

Initial import of shell scripts
--This line, and those below, will be ignored--

A
[ Wrote 4 lines ]
Adding         /tmp/scripts/trunk
Adding         /tmp/scripts/trunk/hello
Adding         /tmp/scripts/branches
Adding         /tmp/scripts/tags

Committed revision 1.

Check out the project and update it:


/tmp/scripts$ cd
/home/jp$ svn checkout file:///home/jp/svnroot/scripts
A scripts/trunk 

A scripts/trunk/hello 

A scripts/branches 

A scripts/tags 

Checked out revision 1. 


/home/jp$ cd scripts 
/home/jp/scripts$ ls -l 


total 12K
drwxr-xr-x 3 jp jp 4.0K Jul 20 01:12 branches/
drwxr-xr-x 3 jp jp 4.0K Jul 20 01:12 tags/
drwxr-xr-x 3 jp jp 4.0K Jul 20 01:12 trunk/

/home/jp/scripts$ cd trunk/ 


/home/jp/scripts/trunk$ ls -l 
total 4.0K 
-rw-r--r-- 1 jp jp 30 Jul 20 01:12 hello 


/home/jp/scripts/trunk$ echo "Hi Mom..." >> hello 


Check the status of your sandbox. Note how the svn status command is 
similar to our cvs -qn update hack in the “CVS” section earlier in this 
appendix: 


/home/jp/scripts/trunk$ svn info 

Path: 

URL: file:///home/jp/svnroot/scripts/trunk 

Repository UUID: 29eeb329-fc18-0410-967e-b075d748cc20 

Revision: 1 

Node Kind: directory 

Schedule: normal 

Last Changed Author: jp 

Last Changed Rev: 1 

Last Changed Date: 2006-07-20 01:04:56 -0400 (Thu, 20 Jul 2006) 


/home/jp/scripts/trunk$ svn status -v
                1        1 jp           .
M               1        1 jp           hello


/home/jp/scripts/trunk$ svn status 
M hello 


/home/jp/scripts/trunk$ svn update 
At revision 1. 


Add a new script to revision control: 


/home/jp/scripts/trunk$ cat << 'EOF' > mcd
> #!/bin/sh
> mkdir -p "$1"
> cd "$1"
> EOF


/home/jp/scripts/trunk$ svn st 


? mcd 

M hello 
/home/jp/scripts/trunk$ svn add mcd 
A mcd 


Commit changes: 


/home/jp/scripts/trunk$ svn ci 


GNU nano 1.2.4          File: svn-commit.tmp

* Tweaked hello
* Added mcd
--This line, and those below, will be ignored--

M    trunk/hello
A    trunk/mcd
[ Wrote 6 lines ]
Sending        trunk/hello
Adding         trunk/mcd
Transmitting file data ..
Committed revision 2.


Update the sandbox, make another change, then check the difference: 


/home/jp/scripts/trunk$ svn up 
At revision 2. 


/home/jp/scripts/trunk$ vi hello 


/home/jp/scripts/trunk$ svn diff hello
Index: hello
--- hello (revision 2)
+++ hello (working copy)
@@ -1,3 +1,3 @@
 #!/bin/sh
 echo 'Hello World!'
-Hi Mom...
+echo 'Hi Mom...'


Commit the change, avoiding the editor by putting the log entry on the 
command line: 


/home/jp/scripts/trunk$ svn -m 'Fine tuning' commit 
Sending trunk/hello 

Transmitting file data . 

Committed revision 3. 


See the history of the file: 


/home/jp/scripts/trunk$ svn log hello
r3 | jp | 2006-07-20 01:23:35 -0400 (Thu, 20 Jul 2006) | 1 line

Fine tuning

r2 | jp | 2006-07-20 01:20:09 -0400 (Thu, 20 Jul 2006) | 3 lines

* Tweaked hello
* Added mcd

r1 | jp | 2006-07-20 01:04:56 -0400 (Thu, 20 Jul 2006) | 2 lines

Initial import of shell scripts


Add some revision metadata, and tell the system to expand it. Commit it and 
examine the change: 


/home/jp/scripts/trunk$ vi hello 


/home/jp/scripts/trunk$ cat hello
#!/bin/sh
# $Id$
echo 'Hello World!'
echo 'Hi Mom...'

/home/jp/scripts/trunk$ svn propset svn:keywords "Id" hello
property 'svn:keywords' set on 'hello'

/home/jp/scripts/trunk$ svn ci -m '* Added ID keyword' hello
Sending hello 


Committed revision 4. 


/home/jp/scripts/trunk$ cat hello
#!/bin/sh
# $Id: hello 5 2006-07-21 09:09:34Z jp $
echo 'Hello World!'
echo 'Hi Mom...'


Compare the current revision to r2, revert to that older (broken) revision, 
realize we goofed and get the most recent revision back: 


/home/jp/scripts/trunk$ svn diff -r2 hello
Index: hello
--- hello (revision 2)
+++ hello (working copy)
@@ -1,3 +1,4 @@
 #!/bin/sh
+# $Id$
 echo 'Hello World!'
-Hi Mom...
+echo 'Hi Mom...'

Property changes on: hello
Name: svn:keywords
   + Id


/home/jp/scripts/trunk$ svn update -r2 hello
UU hello
Updated to revision 2.

/home/jp/scripts/trunk$ cat hello
#!/bin/sh
echo 'Hello World!'
Hi Mom...


/home/jp/scripts/trunk$ svn update -rHEAD hello 
UU hello 
Updated to revision 4. 


/home/jp/scripts/trunk$ cat hello 
#!/bin/sh 

# $Id: hello 5 2006-07-21 09:09:34Z jp $ 
echo 'Hello World!' 

echo 'Hi Mom...' 


See Also 


■ man svn
■ man svnadmin
■ man svndumpfilter
■ man svnlook
■ man svnserve
■ man svnversion
■ The Subversion website
■ TortoiseSVN, a simple SVN frontend for Explorer (cool!)
■ Version Control with Subversion by C. Michael Pilato, Ben Collins-Sussman,
  and Brian Fitzpatrick
  - “Appendix B: Subversion for CVS Users”
■ The FreeBSD guide to using Subversion
■ SVN static builds for Solaris, Linux, and macOS
■ Better SCM Initiative version control system comparison
■ “A Visual Guide to Version Control” on BetterExplained
■ Recipe 16.16, “Creating and Changing Into a New Directory in One Step”


Meld 


Meld is not a revision control tool itself; it is a very useful graphical diff and 
merge tool that can work with revision control systems. When run normally, 
it allows you to compare and merge files and directories. When run from a 
revision control sandbox, it will compare the working copy to the version 
under revision control and show you what you’ve changed. Trust us, it’s 
awesome. 


Pros 

■ Cross-platform (Python)
■ Available for all or most Linux distributions
■ Windows installer
■ Unofficial Mac installers


Cons

■ None


Example

[Figure D-1. Meld in action: a side-by-side comparison of two copies of hello, one with echo 'Hello World!' and /home/jp/myrepo/hello with echo 'Hello Mom!']


See Also

■ man meld
■ http://meldmerge.org/
■ https://en.wikipedia.org/wiki/Meld_(software)


etckeeper 


etckeeper is not a revision control tool itself, but it uses one to put your /etc/
directory under revision control. It’s available in all or most Linux
distributions, and it hooks into cron to do daily commits and into the package
manager to do commits before and after package operations. It also works
around the issues of files appearing and disappearing, ownership, permissions,
and such that revision control systems usually don’t handle all by themselves.
It uses the underlying tool’s “ignore” file to ignore files that change too often
or are otherwise not useful to track.


Out of the box, etckeeper creates a repository in /etc/ and starts committing.
You can configure the underlying revision control system to push to a remote
repository as a backup as well.


Which revision control system it uses varies by distribution, and it’s 
configurable as well. 


Here are a few tips, if you’re thinking about using etckeeper: 


■ There are security implications to storing the /etc/shadow file in etckeeper.
  See the README for details.
■ You will need to install the Extra Packages for Enterprise Linux (EPEL)
  repository for Red Hat Enterprise, CentOS, and similar RPM distros.
■ On those RPM distros, etckeeper will not initialize or commit for you the
  way it does on Debian. After installing the RPM, you will need to run
  sudo etckeeper init and sudo etckeeper commit 'First commit' before it
  will start working for you.


Pros 


■ Set-it-and-forget-it revision control for /etc/


Cons

■ See the potential security implications of storing /etc/shadow, noted earlier.
■ The out-of-the-box configuration is local only.


Example 
Here’s an example install on a mostly stock Debian (Jessie) system:

[jp@jessie:T0:L1:C19:J0:2016-07-04_15:47:25_EDT]
/home/jp$ sudo apt-get update
[sudo] password for jp:

Fetched 7,652 B in 4s (1,796 B/s)
Reading package lists... Done

[jp@jessie:T0:L1:C20:J0:2016-07-04_15:47:50_EDT]
/home/jp$ sudo apt-get install etckeeper
Reading package lists... Done 
Building dependency tree 
Reading state information... Done 
The following extra packages will be installed: 
git git-man liberror-perl 
Suggested packages: 
git-daemon-run git-daemon-sysvinit git-doc git-el git-email git-gui 
gitk gitweb 
git-arch git-cvs 
git-mediawiki git-svn 
The following NEW packages will be installed:
  etckeeper git git-man liberror-perl
0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 4,587 kB of archives. 
After this operation, 23.7 MB of additional disk space will be used. 
Do you want to continue? [Y/n] y 


Setting up etckeeper (1.15) ...
Initialized empty Git repository in /etc/.git/
[master (root-commit) 6d597ca] Initial commit
 Author: jp <jp@jessie.jpsdomain.org>
 1324 files changed, 32995 insertions(+)
 create mode 100755 .etckeeper
 create mode 100644 .gitignore

 create mode 100644 xml/catalog
 create mode 100644 xml/docutils-common.xml
 create mode 100644 xml/xml-core.xml


/home/jp$ cd /etc 
/etc$ sudo git status 


On branch master 
nothing to commit, working directory clean 


/etc$ cat /etc/cron.daily/etckeeper
#!/bin/sh
set -e
if [ -x /usr/bin/etckeeper ] && [ -e /etc/etckeeper/etckeeper.conf ]; then
    . /etc/etckeeper/etckeeper.conf
    if [ "$AVOID_DAILY_AUTOCOMMITS" != "1" ]; then
        # avoid autocommit if an install run is in progress
        lockfile=/var/cache/etckeeper/packagelist.pre-install
        if [ -e "$lockfile" ] && [ -n "$(find "$lockfile" -mtime +1)" ]
        then
            rm -f "$lockfile" # stale
        fi
        if [ ! -e "$lockfile" ]; then
            AVOID_SPECIAL_FILE_WARNING=1
            export AVOID_SPECIAL_FILE_WARNING
            if etckeeper unclean; then
                etckeeper commit "daily autocommit" >/dev/null
            fi
        fi
    fi
fi
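
The stale-lock check in the cron script above leans on find's -mtime test. As a rough illustration only, the same test can be pulled into a small function; is_stale and its arguments are hypothetical names and are not part of etckeeper:

```shell
#!/usr/bin/env bash
# is_stale FILE [DAYS]: succeed if FILE exists and find's "-mtime +DAYS"
# test matches it (its age in whole 24-hour periods is greater than DAYS).
is_stale() {
    local file=$1 days=${2:-1}
    [ -e "$file" ] && [ -n "$(find "$file" -mtime +"$days")" ]
}

# Example with the lock file path from the cron script:
# is_stale /var/cache/etckeeper/packagelist.pre-install && echo 'stale lock'
```

Because find counts age in whole days, a lock file has to be at least two days old before +1 matches it, which is presumably why the cron script treats such a file as left over from a failed package run.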


Now etckeeper will commit daily and before and after package operations. 
But you can commit manually as well, and you can use all of the features of 
the underlying revision control system: 


/etc$ sudo useradd carl 


/etc$ sudo git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   group
        modified:   group-
        modified:   gshadow
        modified:   gshadow-
        modified:   passwd
        modified:   shadow
        modified:   subgid
        modified:   subgid-
        modified:   subuid
        modified:   subuid-

no changes added to commit (use "git add" and/or "git commit -a")


/etc$ sudo etckeeper commit 'Added a user for Carl'
[master 8b58601] Added a user for Carl
 Author: jp <jp@jessie.jpsdomain.org>
 11 files changed, 12 insertions(+), 2 deletions(-)


/etc$ sudo git status 
On branch master 
nothing to commit, working directory clean 


See Also 


■ etckeeper
■ man etckeeper
■ http://etckeeper.branchable.com/
  - http://etckeeper.branchable.com/README/


Other 


Finally, it is worth noting that some word processors, such as LibreOffice 
Writer and Microsoft Word, have three relevant features: document 
comparison, change tracking, and versions. 


Document Comparison 


Document comparison allows you to compare documents when their native
file format makes other diff tools difficult to use. You would use this when
you have two copies of a document that didn’t have change tracking turned
on, or when you need to merge feedback from various sources.

While it is trivial to unzip the content.xml file from a given ODF file, the
result has no line breaks and is not terribly pretty or readable. See Recipe
12.5 for a bash script that will do this low-level kind of difference.
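
To give a feel for what a low-level comparison involves, here is a minimal sketch that is not the script from Recipe 12.5. It assumes the unzip utility is installed, and the function name xml_one_tag_per_line and the filenames old.odt and new.odt are made up for illustration:

```shell
#!/usr/bin/env bash
# xml_one_tag_per_line: read XML on stdin and break the line between
# adjacent tags ("><") so that line-oriented diff has something to compare.
xml_one_tag_per_line() {
    sed 's/></>\
</g'
}

# Hypothetical usage: compare the content of two ODF documents.
# diff <(unzip -p old.odt content.xml | xml_one_tag_per_line) \
#      <(unzip -p new.odt content.xml | xml_one_tag_per_line)
```

The output is still raw XML, so this is mostly useful for spotting where two documents differ, not for reading the changes comfortably.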


Refer to Table D-1 at the end of this section for information on how to access 
the built-in GUI comparison function, which is much easier than trying to do 
it manually. 


Change Tracking and Versions 


The change-tracking feature saves information about changes made to a 
document. Review mode uses various copyediting markup on the screen to 
display who did what, when. This is obviously useful for all kinds of creation 
and editing purposes, but please read our warning. 


The versions feature allows you to save more than one version of a document
in a single file. This can be handy in all sorts of odd ways. For example,
we’ve seen router configurations copied and pasted from a terminal into
different versions inside the same document for archival and change control
purposes.


WARNING

The change-tracking and versions features will cause your document to
continually grow in size, since items that are changed are still kept and deleted
items are not really deleted, but only marked as deleted.

Also, if accidentally turned on, change tracking and versions can be very
dangerous information leaks! For example, if you send similar proposals to
competing companies after doing a search and replace and other editing,
someone at one of those companies can see exactly what you changed and when
you changed it. The most recent versions of these tools have various methods
that attempt to warn you or clear private information before a given document is
converted to PDF or emailed, but take a look at any word processor attachments
you receive in email, especially from vendors. You may be surprised.


Accessing These Features 



Table D-1 shows where to find the features described here in LibreOffice 
Writer and Microsoft Word. 


Table D-1. Word processor functions

Feature               Writer menu option        Word menu option
Document comparisons  Edit → Compare Document   Tools → Compare and Merge Documents
Change tracking       Edit → Changes            Tools → Track Changes
Versions              File → Versions           File → Versions


1 We should say “web-scale,” to be buzzword-compliant.

2 See also “10 Things I Hate About Git” by Steve Bennett.


Appendix E. Building bash from Source


In this appendix we’ll show you how to get the latest version of bash and 
install it on your system from source, and we’ll discuss potential problems 
you might encounter along the way. We’ll also look briefly at the examples 
that come with bash and how you can report bugs to the bash maintainer. The 
material in this appendix also appears in Learning the bash Shell, 3rd Edition, 
by Cameron Newham (O’Reilly). 


Obtaining bash 


You can find the very latest details on the current distribution and where to 
obtain it from the bash home page. 


Unpacking the Archive 


Having obtained the archive file, you need to unpack it and install it on your 
system. Unpacking can be done anywhere—we’ll assume you’re unpacking it 
in your home directory. Installing it on the system requires you to have root 
privileges. If you aren’t a system administrator with root access, you can still 
compile and use bash; you just can’t install it as a system-wide utility. The 
first thing to do is uncompress the archive file: gunzip bash-4.4.tar.gz.
Then you need to untar the archive: tar -xf bash-4.4.tar. The -xf
means “extract the archived material from the specified file.” This will create 
a directory called bash-4.4 in your home directory. If you do not have the 
gunzip utility, you can obtain it in the same way you obtained bash or simply 
use gzip -d instead. 
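
The two steps can also be combined by piping the decompressed stream straight into tar. The following is only a sketch; unpack_tarball is a hypothetical helper name and is not part of the bash distribution, and the example filename assumes the bash-4.4 archive is in the current directory:

```shell
#!/usr/bin/env bash
# unpack_tarball ARCHIVE [DEST]: decompress a .tar.gz archive and extract it
# into DEST (default: the current directory).
unpack_tarball() {
    local archive=$1 dest=${2:-.}
    mkdir -p "$dest" || return 1
    # gzip -dc writes the decompressed tar stream to stdout;
    # "tar -xf -" reads that stream from stdin and extracts it under DEST.
    gzip -dc "$archive" | tar -xf - -C "$dest"
}

# Example:
# unpack_tarball bash-4.4.tar.gz "$HOME"
```

This leaves the original .tar.gz file in place, whereas gunzip on its own replaces it with the uncompressed .tar file.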


The archive contains all of the source code needed to compile bash and a
large amount of documentation and examples. We’ll look at these things and
how you go about making a bash executable in the rest of this appendix.


What’s in the Archive


The bash archive contains a main directory (bash-4.4 for the current version) 
and a set of files and subdirectories. Among the first files you should 
examine are: 


CHANGES
A comprehensive list of bug fixes and new features since the last version

COPYING
The GNU copyleft for bash

MANIFEST
A list of all the files and directories in the archive

NEWS
A list of new features since the last version

README
A short introduction and instructions for compiling bash


You should also be aware of two directories:

doc
Information related to bash in various formats

examples
Examples of startup files, scripts, and functions


The other files and directories in the archive are mostly things that are needed
during the build. Unless you are going to go hacking into the internal
workings of the shell, they shouldn’t concern you; if you’re interested in
seeing the full list, however, check out Appendix B.


Documentation


The doc directory contains a few articles that are worth reading. Indeed, it 
would be well worth printing out the manual entry for bash so you can use it 
in conjunction with this book. The README file gives a short summary of 
the files. 


The document you’ll most often use is the manpage entry bash.0. This
summarizes all of the facilities your version of bash has and is the most
up-to-date reference you can get. This document is also available through the
man facility once you’ve installed the package.


Of the other documents, FAQ is a Frequently Asked Questions document 
with answers, readline.3 is the manual entry for the readline facility, and 
article.ms is an article about the shell that appeared in Linux Journal and was 
written by the current bash maintainer, Chet Ramey. 


Configuring and Building bash 


Compiling bash “straight out of the box” is easy: you just type
./configure and then make! The configure script attempts to work out
whether you have various utilities and C library functions, and their locations 
on your system. It then stores the relevant information in the file config.h. It 
also creates a file called config.status, which is a script you can run to 
recreate the current configuration information. While configure is running, it 
prints out information on what it is searching for and where it finds it. 


The configure script also sets the location where bash will be installed; the 
default is the /usr/local area (/usr/local/bin for the executable, /usr/local/man 
for the manual entries, etc). If you don’t have root privileges and want it in 
your own home directory, or you wish to install bash in some other location, 
you'll need to provide configure with the path you want to use. You can do
this with the --exec-prefix option. For example:


configure --exec-prefix=/usr 


specifies that the bash files will be placed under the /usr directory. Note that
configure prefers option arguments be given with an equals sign (=).


After the configuration finishes and you type make, the bash executable is 
built. A script called bashbug is also generated, which allows you to report 
bugs in the format the bash maintainers want. We’ll look at how to use it 
later in this appendix. 


Once the build finishes, you can see if the bash executable works by typing 
./bash. 


To install bash, type make install. This will create all of the necessary 
directories (bin, info, man and its subdirectories) and copy the files to them. 


If you’ve installed bash in your home directory, be sure to add your own bin
path to your $PATH and your own man path to $MANPATH.
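
One way to make those additions without piling up duplicate entries each time a startup file is sourced is sketched below. The prepend_path function is a hypothetical helper, and the directories in the commented example are assumed locations, not paths that the bash build chooses for you:

```shell
#!/usr/bin/env bash
# prepend_path VAR DIR: put DIR at the front of the colon-separated
# variable named VAR unless DIR is already listed in it.
prepend_path() {
    local var=$1 dir=$2
    local current=${!var}          # indirect expansion: the value of $VAR
    case ":$current:" in
        *":$dir:"*) ;;             # already present; leave it alone
        *) printf -v "$var" '%s' "$dir${current:+:$current}" ;;
    esac
}

# Example (assumed install locations under your home directory):
# prepend_path PATH "$HOME/bin"
# prepend_path MANPATH "$HOME/man"
# export PATH MANPATH
```

The function uses bash-specific features (indirect expansion and printf -v), so it belongs in a bash startup file such as ~/.bashrc and not in a file read by other shells.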


bash comes preconfigured with nearly all of its features enabled, but it is
possible to customize your version by specifying what you want with the
--enable-feature and --disable-feature command-line options to configure.
See the INSTALL file for more details on the configurable features and what
they do.


Many other shell features can be turned on or off by modifying the file 
config-top.h. For further details on this file and on configuring bash in 
general, see INSTALL. 


Finally, to clean up the source directory and remove all of the object files and
executables, type make clean. Make sure you’ve run make install first;
otherwise, you'll have to rebuild from scratch before you can install.


Testing bash 


There are a series of tests that can be run on your newly built version of bash 
to see if it is running correctly. The tests are scripts that are derived from 
problems reported in earlier versions of the shell. Running these tests on the 
latest version of bash shouldn’t cause any errors. 


To run the tests, just type make tests in the main bash directory. The name
of each test is displayed, along with some warning messages, and then it is
run. Successful tests produce no output (unless otherwise noted in the
warning messages).


If any of the tests fail, you’ll see a list of things that represent differences 
between what is expected and what happened. If this occurs, you should file a 
bug report with the bash maintainer; see “Reporting Bugs” for information on 
how to do this. 


Potential Problems 


Although bash has been installed on a large number of different machines 
and operating systems, there are occasionally problems. Usually the problems 
aren’t serious and a bit of investigation can result in a quick solution. 


If bash didn’t compile, the first thing to do is check that configure guessed 
your machine and operating system correctly. Then check the file NOTES, 
which contains some information on specific Unix systems. Also look in 
INSTALL for additional information on how to give configure specific 
compilation instructions. 


Installing bash as a Login Shell 
See Recipe 1.11. 


Examples 


See Appendix B for examples included with bash. 


Who Do I Turn To? 


No matter how good something is or how much documentation comes with it,
you'll eventually come across something that you don’t understand or that
doesn’t work. In such cases, the first step is to read the documentation
carefully (in more casual computer parlance: RTFM). In many cases,
this will answer your question or point out what you’re doing wrong.


Sometimes you'll find this only adds to your confusion or confirms that there
is something wrong with the software. The next thing to do is to talk to a
local bash guru to sort out the problem. If that fails, or there is no guru handy,
you'll have to turn to other means (currently only via the internet).


Asking Questions 


If you have any questions about bash, there are currently many ways to go
about getting them answered. You can email questions to help-bash@gnu.org
or bash-maintainers@gnu.org, or you can post your question to the USENET
newsgroup gnu.bash.bug (perhaps via
https://groups.google.com/forum/#!forum/gnu.bash.bug). There are also more
generic help sites, such as Stack Overflow, Linux Stack Exchange, and so
forth.


When asking a question, try to give a meaningful summary of your question 
in the subject line—see “How to Ask Questions the Smart Way” by Eric 
Raymond. 


Reporting Bugs 


Bug reports should be sent to bug-bash@gnu.org, and include the version of 
bash and the operating system it is running on, the compiler used to compile 
bash, a description of the problem, a description of how the problem was 
produced, and, if possible, a fix for the problem. The best way to do this is 
with the bashbug script, installed with bash. 
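
If you want to see some of those details before opening a report, a few of them can be printed from within bash itself. This is only a sketch: bug_env_summary is a hypothetical helper name, and the compiler used for the build is not covered here.

```shell
#!/usr/bin/env bash
# bug_env_summary: print a few of the details a bug report asks for.
# BASH_VERSION and MACHTYPE are variables set by bash; uname reports the OS.
bug_env_summary() {
    printf 'bash version: %s\n' "$BASH_VERSION"
    printf 'machine type: %s\n' "$MACHTYPE"
    printf 'operating system: %s\n' "$(uname -sr)"
}

bug_env_summary
```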


Before you run bashbug, make sure that you’ve set your $EDITOR
environment variable to your favorite editor and have exported it (bashbug
defaults to Emacs, which might not be installed on your system). When you
execute bashbug it will enter the editor with a partially blank report form.
Some of the information (bash version, operating system version, etc.) will
have been filled in automatically. We’ll take a brief look at the form, but
most of it is self-explanatory.
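
Setting and exporting the variable might look like the following sketch. The ensure_editor function is a hypothetical helper, and vi as the fallback is our assumption; substitute whichever editor you prefer:

```shell
#!/usr/bin/env bash
# ensure_editor [FALLBACK]: keep an existing $EDITOR setting; otherwise set it
# to FALLBACK (default: vi). Either way, export it so child processes see it.
ensure_editor() {
    : "${EDITOR:=${1:-vi}}"
    export EDITOR
}

ensure_editor
# bashbug    # uncomment to open the report form in $EDITOR
```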


The From field should be filled out with your email address. For example:

From: confused@wonderland.oreilly.com

Next comes the Subject field; make an effort to fill it out, as this makes it
easier for the maintainers when they need to look up your submission. Just
replace the line surrounded by square brackets with a meaningful summary of
the problem.


The next few lines are a description of the system and should not be touched. 
Then comes the Description field. You should provide a detailed 
description of the problem and how it differs from what is expected. Try to be 
as specific and concise as possible when describing the problem. 


The Repeat-By field is where you describe how you generated the problem;
if necessary, list the exact keystrokes you used. Sometimes you won’t be able 
to reproduce the problem yourself, but you should still fill out this field with 
the events leading up to the problem. Attempt to reduce the problem to the 
smallest possible form. For example, if it was a large shell script, try to 
isolate the section that produced the problem and include only that in your 
report. 


Lastly, the Fix field is where you can provide the necessary patch to fix the
problem if you’ve investigated it and found out what was going wrong. If you
have no idea what caused the problem, just leave the field blank.


TIP 


If the maintainer can easily reproduce and then identify the problem, it will be 
fixed faster—so make sure your Repeat-By (and ideally Fix) sections are as 
good as you can make them. Reading the article mentioned in “Asking 
Questions” is also encouraged. 


Once you’ve finished filling in the form, save it and exit your editor. The
form will automatically be sent to the maintainers.




Index 


Symbols 
! (negation) operator, Discussion, Discussion 
$PROMPT_COMMAND, Discussion, Discussion 
$PROMPT_DIRTRIM, Discussion 
$PS1 prompt, Solution, Discussion, Discussion 
$PS2 prompt, Discussion, Solution, Discussion, Solution 
$PS3 prompt, Solution, Discussion, Solution 
$PS4 prompt, Discussion, Solution, Solution 
$PWD variable, Discussion 
$RANDOM variable, Solution-Discussion 
not available in dash, Solution 
$REPLY variable, Solution 
$TMOUT environment variable, Solution 
$TMP variable, Solution 
$UMASK variable, Discussion 
${} variable, Solution, Problem 
${!prefix*} syntax, Discussion 
${!prefix@} syntax, Discussion 
${#VAR}, Discussion 
${1:0:1} syntax, substring of a shell variable, Discussion, Problem 
${:+} syntax, Solution 
${:-} syntax, Solution 
${:?} syntax, Solution 
${VAR#alt}, Discussion 
% remainder operator, Discussion 
& (ampersand), running a command in the background, Solution 
&& operator 
conditional execution with, Solution, Discussion 
separating commands run sequentially, Solution 
&> redirection operator, Discussion 
&>> redirection operator, Discussion 
'' (single quotes) 
enclosing literal strings in shells, Solution 
in alias definitions, Discussion 
in prompts, Discussion 
using in regular expression, Discussion 
using with strings to preserve spacing, Discussion 


(( )) double parentheses, Discussion, Discussion 
around if expressions, Discussion 
in special for loop, Solution 
() (parentheses) 
grouping in regular expressions, Discussion 
in function definitions, Discussion 
running commands in a subshell, Solution, Discussion 
* (asterisk), Problem 
in pattern matching, Discussion, Discussion 
in regular expressions, Searching with More Complex Patterns 
multiplication operator, Discussion, Discussion 
** operator, Discussion 
+ (plus sign), Discussion 
+(... ) grouping syntax for extended pattern matching, Discussion 
date command and, Discussion 
, (comma) operator, Discussion 
, (comma), adding to numbers, Problem 
- (dash) 
filenames beginning with, Problem 
leading - in trap arguments, Discussion 
single trailing dash on the shell, Solution, Discussion 
- (minus sign) 
on printf format specifier, Discussion 
using to close a file descriptor, Discussion 
-I argument, printf, Discussion 
dot directory, Discussion 
for current directory, Discussion 
adding current directory to $PATH, Discussion 
in regular expressions, Searching with More Complex Patterns 
prefixing commands in current working directory, Discussion 
showing all hidden (dot) files in current directory, Problem 
using instead of source, Discussion 
.* regular expression, Discussion 
./ (dot slash) syntax, Discussion 
/ (slash) 
filename pattern ending with, Discussion 
in absolute pathnames, Discussion 
indicating root of filesystem, Discussion, Discussion, Discussion 
substitution operator for variable references, Discussion 
using to reference script in current directory, Solution 
24-hour time, Discussion 
: (colon), Solution 
modifiers on history commands, Discussion 
:+ variable operator, Discussion 
:- operator, Discussion, Discussion 
:; Syntax, Discussion 
:= (assignment) operator, Solution, Problem 
; (semicolon) 
escaping in find command, Discussion 


separating commands run in sequence, Solution 
serving same purpose as newlines, Discussion 
trailing ; within {} used to group commands, Solution, Discussion 
;& syntax, Discussion 
;; (double semicolon), ending statements associated with a pattern, 
Discussion 
< redirection operator, Discussion, Discussion 
0<&- or <&- syntax, closing STDIN file descriptor, Discussion 
<< indicating here documents, Solution 
<<- syntax, indenting here documents, Solution 
= (assignment) operator, Solution 
= (equality) operator, Problem 
string comparator in double-bracket syntax, Discussion 
= (equals sign), in arithmetic operations, Discussion 
== (equality) operator, Problem 
=~ operator, Solution, Discussion, Discussion 
>, redirecting output, Problem-Solution 
>&, Discussion 
>| syntax, Solution 
using file descriptor numbers with, Discussion, Discussion 
>> redirection operator, Solution, Discussion 


? (question mark), matching any single character, Discussion, 
Discussion, Discussion 


?(... ) syntax, Discussion 
@ (at sign) 


@(... ) grouping syntax for extended pattern matching, Discussion 
@P operator, Discussion 
adding trailing @ to $SSH_USER, Discussion 
[ left square bracket, Discussion, Discussion 
in if statements, Discussion 
[:alpha:] character class, Discussion 
[:blank:] character class, Discussion, Discussion 
[:digit:] character class, Discussion 
[:space:] character class, Discussion 
[@] notation, Discussion 
[[ ]] double brackets syntax, Discussion 
[] (square brackets) 
character classes in regular expressions, Searching with More 
Complex Patterns 
for test command, Discussion 
in pattern matching, Discussion, Discussion 
in while loops, Discussion 
test operators used with [ ] and [[ ]], Test Operators 
\ (backslash) 
before commands, avoiding aliases, Discussion 
disabling alias expansion for any command, Discussion 
escape sequences in sed, Solution 
escape sequences in tr utility, Discussion 
escaping spaces in a regular expression, Discussion 


escaping special characters in regular expressions, Searching with 
More Complex Patterns 


shell escape character, Discussion 
suppressing alias expansion, Discussion 
\n (newline) 
\{n,m\}, \{n\}, or \{n,\} in interval expressions, Searching with More 
Complex Patterns 
^ (caret) 
case conversion with, Solution, Discussion 


in regular expressions, Searching with More Complex Patterns, 
Discussion, Discussion, Discussion 


negating character classes in pattern matching, Discussion 


negating character classes in regular expressions, Searching with 
More Complex Patterns 


substitution mechanism, Solution 
` (backquotes), Discussion, Discussion, Discussion 
{x..y} brace expansion, Discussion 
{} (curly braces) 
enclosing variable names, Solution, Discussion 
escaping in find command, Discussion 
forming more precise branching blocks with, Discussion 
grouping commands in a code block, Discussion 
grouping commands to run in subshell, Discussion 
in evaluation of shell variables, Discussion 
using in pattern matching to prevent alphabetization, Discussion 
using to group commands, Solution, Discussion 
| (pipe symbol), Discussion, Command-Line Processing Steps 
>| redirection syntax, Discussion 


linking sequence of multiple commands with, Solution 
logical OR in pattern matching, Discussion 


redirecting both standard output and standard error into a pipe, 
Discussion 


sending output to the next program, Solution 
swapping STDERR and STDOUT before pipe redirection, Solution 
using tee command in piped I/O, Solution, Discussion 
|& syntax, redirecting standard output and standard error into a pipe, 
Discussion 
|| operator, Solution, Discussion 
~ tilde expansion, Discussion 
~, indicating home directory, Solution 
~/bin directory, creating and adding to path, Solution 
A 
-a (logical AND) operator, Solution 
-a option, type and which, Discussion 
absolute paths, Discussion 
administrative and housekeeping tasks, Housekeeping and 
Administrative Tasks-Discussion 
adding a prefix or suffix to output, Problem 
capturing file metadata for recovery, Problem 
circular backups, Problem 
clearing the screen on logout, Problem 
commifying numbers, Problem 
counting differences in files, Problem 
creating index of many files, Problem 


editing a file in place, Problem 
emulating DOS pause command, Problem 
finding lines in one file but not in another, Problem 
finding out if a process is running, Problem 
grepping ps output without getting the grep process, Problem 
keeping the most recent N objects, Problem 
using a for loop, Discussion 
using func_shift_by, Solution 
using func_shift_by in production, Discussion 
logging an entire session or batch job, Solution 
numbering lines in files, Problem 
prepending data to existing file, Problem 
recovering disconnected sessions with screen, Problem 
removing or renaming files with special characters, Problem 
renaming many files, Problem 
sharing a single bash session, Problem 
unzipping many ZIP files, Problem 
using diff and patch, Problem 
using GNU info and Texinfo on Linux, Problem 
using sudo on a group of commands, Problem 
writing sequences, Problem 


writing to a circular log, Problem 


administrator accounts, Solution 


AIX, getting bash for, Discussion 


alias command, Solution 


alias expansion, commands, Discussion 


aliases, Discussion, Solution 
avoiding, Problem 

avoiding with \ before command, Discussion 

clearing all, for security, Problem 

for getting to the bottom of things, Solution 

redefining commands with, Problem 
ANSI color escape sequences, ANSI Color Escape Sequences 
ANSI escapes in prompts, working around, Discussion 
appending output, Solution 
apropos command, Discussion 
archiving files, Solution 
Argument list too long errors, Problem 
arguments 

counting for a script, Problem 

getting default value of, Problem 

in printf statements, best practices for, Discussion 

parsing command-line arguments with case statement, Problem 

passed to a script, looping over, Problem 

printing to screen with echo, Discussion 

removing after handling in scripts, Problem 

reusing, Problem 

using output as, to connect two programs, Problem 
ARG_MAX value, Solution, Discussion, Discussion 
arithmetic, Shell Logic and Arithmetic 

calculator using shell arithmetic and RPN notation, Problem 


creating a command-line calculator with floating-point arithmetic, 
Problem 
in awk, Discussion 
integer arithmetic in bash for loops, Discussion 
performing in a shell script, Problem 
while loop for arithmetic conditions, Solution 
with dates and time, Problem 
arithmetic expansion, Discussion 
arithmetic expressions, Discussion 
arrays 
associative arrays in awk, Solution 
using to create a histogram, Solution 
associative arrays in bash, Solution 
using to create a histogram, Solution 
parsing output into, Problem 
parsing words into, using read -a, Problem 
using an array with case conversion substitution, Solution 
using array variables, Problem 
arrow keys, using to scroll through commands, Solution 
ASCII 
tab and space characters, Discussion 
table of ASCII values, Table of ASCII Values 
Asciidoc, wrapper for tool, Discussion 
assignment operators, Discussion 
assignments 
cascaded, Discussion 


command not found on, Problem 
awk utility, Intermediate Shell Tools I, Solution 
-F (field separator) option, Discussion 
calculator using floating-point arithmetic expressions from, Solution 
converting datafile to CSV, Solution 
counting string values, Problem 
creating histogram of some data, Problem 
parsing ifconfig output to find IP address, Discussion 
piping df command output into, Discussion 
piping ls output into and paring it down, Problem 
printing first word of lines of input, Solution 
printing out fields, Discussion 
reversing word order of input lines, Problem 
showing paragraph of text after found phrase, Problem 
summing a list of numbers, Problem 
using to combine or convert field separators, Discussion 
using to isolate fields in data, Solution 
using to number lines in a file, Discussion 
using to parse a CSV datafile, Solution 
using to trim leading or trailing whitespace, Discussion 
using to update fields in a datafile, Solution 
using with last to add prefix or suffix to output, Solution 
writing sequences with, Solution 

B 

background 
bg command, Discussion 


running a command in, Discussion, Solution 
running script in, Problem 
backups 
circular, Problem 
keeping N most recent directories, Problem 
basename command, Discussion 
basename, using bash string manipulation for, Problem, Discussion 
bash, Beginning bash 
adding new features using loadable builtins, Adding New Features to 
bash Using Loadable Builtins-Improving Programmable Completion 
building from source, Building bash from Source-Reporting Bugs 
compiled with --enable-coprocesses, Solution 
compiled with --enable-net-redirections, Solution 
finding your IP address, Solution 
computing and drawing a histogram, Problem 
configuring and customizing (see configuring and customizing bash) 
counting string values with, Problem 
decoding the prompt, Problem 
documentation, learning more about, Problem 


examples included with, Examples Included with bash-bash 
Documentation and Examples 


finding and running commands, Problem 
finding portably for #!, Problem 
getting for BSD-based systems, Problem 


getting for Linux, Problem 


finding bash versions for distributions, Discussion 
getting for macOS, Problem 

getting for Unix, Problem 

getting for Windows, Problem 

invoking, options for, bash Invocation 

keeping updated, Problem 

manpage for, Discussion 

-n option, Discussion 

network redirection, using to log to syslog from scripts, Solution 
new variables to support debugging as of version 3.0, Discussion 
philosophy, Intermediate Shell Tools I 

reasons for using, Why bash? 

reserved words, bash Reserved Words 

running in POSIX mode, Discussion 

setting as default shell, Problem 

trying out without buying or building, Problem 


version 3.2, Solution 


bash --version command, Solution 


bash-completion-20060301.tar.gz library, Solution, Discussion 


bashrc file, Solution 


example file, Solution 


$BASH_REMATCH variable, Discussion 
batch files, Solution 


(see also shell scripts) 


Bazaar, Bazaar-Mercurial 


bc program, using as a coprocess, Solution 
bdiff utility, Solution 
BEGIN and END patterns (awk), Discussion 
BEGIN keyword, Discussion 
bg (background) command, Discussion 
/bin directory, creating and adding to path, Discussion 
bind command, Solution 
bit bucket, Discussion 
[:blank:] character class, Discussion 
bot or bottom (see aliases, for getting to the bottom of things) 
Bourne shell, Beginning bash, Why bash? 
restricted version, rsh, Discussion 
branching on conditions, Problem-Problem 
branching many ways, using case statement, Problem 
if statements, Solution 
BSD-based systems, Discussion, Solution 
ARG_MAX limits, Discussion 
bash in /usr partition, Discussion 
chsh -l command, listing and editing shell, Solution 
echo command on, Discussion 
getting bash for, Problem 
hexdump utility, Discussion 
MAC implementation, Discussion 
on virtual machines, Discussion 
which utility, Discussion 
buffer overflow attacks, Discussion 
buffering, buffered STDOUT versus unbuffered STDERR, Discussion 
builtin command, Discussion, Discussion 
using to avoid shell functions and aliases, Solution 
using to redefine a builtin command, Discussion 
builtin commands, Discussion 
enable -a, listing builtins and enabled/disabled status, Discussion 
enable -n, turning off with, Discussion 
reference listing of, Builtin Commands 
builtins, loadable (see loadable builtins) 
bzip2 utility, Solution 
-j option, Solution 
C 
C language, code for loadable builtins, Discussion 
C shell (csh), Beginning bash 
call by value (exported variables), Discussion 
case 
converting to camel case, Problem 
finding files irrespective of, Problem 
ignoring in grep search for text, Problem 
case statement, Problem 
parsing command-line arguments with, Problem 
patterns in, following rules of pathname expansion, Discussion 
using in parsing arguments with getopts, Discussion 
using to validate input, Solution 
case..esac blocks, Discussion 
cat command, Discussion 
example, redirecting output to file, Solution 
using in while loop, Discussion 
using to number lines in a file, Solution 
using to prepend data to a file, Problem 
using here-document or here-string, Discussion 
using with here document to input HTML in script, Discussion 
zcat for compressed files, Solution 
-L and -P options, Discussion 
creating a better command, Problem 
defining shell function to change how it works, Discussion 
running rm command only if cd succeeds, Problem 
setting your $CDPATH, Problem 
cdrecord program, Solution 
CDs, burning, Problem 
character classes, Pattern-Matching Characters 
counting in a file using wc command, Problem 
chmod command, Solution 
using with find and xargs, Discussion, Solution 
using with four-digit octal modes, Solution 
chpass command, Solution 
chroot command, Solution 
chroot jails, Solution, Problem 
chsh -l command, Solution 
chsh -s command, Solution 
clear command 
putting in .bash_logout file, Solution 
setting trap to run on shell termination, Solution 
clobbering files 
accidentally, with uniq command, Discussion 
during output redirects, Problem 
noclobber option and, Discussion 
on purpose, Problem 
comm utility, Solution, Discussion 


comma-separated values (CSV) 
converting a datafile to, Problem 
parsing a CSV datafile, Problem 


using alternate values for, Problem 
command alias, Solution 
command -p, Solution 
using to ignore shell functions and aliases, Solution 
command hash, clearing, Problem 
command keyword, prefixing commands with, Discussion 
command prompt ($PS1), Discussion 
command substitution, Solution, Discussion 
eval command, eval 
commands 
not found, better error message for, Problem 
redefining with alias, Problem 
repeating the last command, Problem 
running almost the same command, Problem 
command_not_found_handle function, redefining, Discussion 
-F option, Discussion 
completion (programmable) 
filename completion, Solution 
finishing pathnames with Tab key, Solution 
improving, Problem-Discussion 
initialization files for, Solution 
compound commands, Discussion, Command-Line Processing Steps 
compress utility, Solution 
compressed files 


checking tar archive for unique directories, Problem 
using grep on, Problem 
compressing files, Problem 
common file extensions and compression utilities, Solution 
uncompressing files, Problem 
using tar, Discussion 
concatenation, Discussion 
(see also cat command) 
configuration files 
using external configuration files in scripts, Problem 
using in scripts with includes and sourcing, Problem 
configuring and customizing bash, Configuring and Customizing bash- 
Discussion 


adding new features using loadable builtins, Adding New Features to 
bash Using Loadable Builtins 


adjusting readline behavior using .inputrc, Problem 
adjusting shell behavior and environment, Problem 
changing your path permanently, Solution 
changing your path temporarily, Problem 
functions for, Solution 
creating a better cd command, Problem 
creating and changing into new directory in one step, Problem 
creating self-contained, portable rc files, Problem 
customizing the prompt, Problem 
getting started with a custom configuration, Problem 


getting to the bottom of the directory structure, Problem 
improving programmable completion, Problem 
keeping a private stash of utilities by adding ~/bin, Problem 
prompt before the program runs, Problem 
setting shell history options, Problem 
setting your $CDPATH, Problem 
common directories in $CDPATH, Discussion 

shortening or changing command names, Problem 
startup options, Problem 
synchronizing shell history between sessions, Problem 
using initialization files correctly, Problem-Problem 
when programs are not found, Problem 

control structures, Shell Logic and Arithmetic 

coproc command, Solution 

core dumps, preventing, Problem 

cp command 
copying one file on top of another, Discussion 
cp -al, Discussion 

CPIO files, Discussion 

cron utility, Discussion 
entries for script keeping an eye on something, Discussion 
escaping % to avoid errors, Discussion 
using date and cron to run a job on the Nth day, Problem 
using keychain script with, Solution 
using to purge data, Discussion 


using to send email from scripts, Solution 
cross-platform shell scripts, Advanced Scripting 
(see also portable scripts) 
writing on Linux, problems with, Discussion 
crypt, Discussion 
CS_PATH, Discussion 
curl utility, Solution 
current working directory 
displaying in prompts, Discussion 
not in the $PATH, Problem 
cut command, Solution, Solution 
-c option, Solution 
using fields, Discussion 
using in renaming files, Solution 
CVE-2014-6271 (shellshock vulnerability), Discussion 
Cygwin, Why bash? 
about, Cygwin 
downloading and installing on Windows, Solution 
D 
daemon, running a script as, Problem 
dash (Debian Almquist shell), Solution 
$RANDOM variable and, Solution 
devscripts package for bashisms not working on dash, Discussion 
data injection attacks, Discussion 
data validation, Discussion 


databases, creating and initializing using MySQL, Problem 
date command, Working with Dates and Times 
(see also dates and time; GNU date) 
with -d or --date argument of tomorrow, Discussion 
dates and time, Working with Dates and Times-Discussion 
$HISTTIMEFORMAT variable, Discussion 
arithmetic with, Problem 
automating date ranges, Solution 
circular series, Discussion 
converting epoch seconds to, Problem 
converting to epoch seconds, Problem 
counting elapsed time, Problem 
finding files by date, Problem 
formatting for output, Problem 
getting yesterday and tomorrow's dates, using Perl, Problem 
handling time zones, Daylight Saving Time, and leap years, Problem 
in printf format, Examples 
logging with dates, Problem 


string formatting with strftime, Date and Time String Formatting with 
strftime 


supplying a default date, Problem 


using date and cron to run a job on the Nth day, Problem 
day of week, running a cron job on, Solution 
Daylight Saving Time, Working with Dates and Times 
handling, using tools for, Discussion 


dbiniter script, Solution 
Debian, Discussion, Discussion 

.deb files, Discussion 

devscripts package, Discussion 

getting bash for, Solution 

net-redirections, Solution 

which utility, Discussion 
Debian Almquist shell (see dash) 
DEBUG signal, Discussion 
debugger, Bash Debugger Project, Discussion 
debugging, Advanced Scripting, Problem 

long sequence of piped I/O, Problem 

new variables to support debugging in bash 3.0, Discussion 
declare statements, Discussion 

-F option, Discussion 

declare -p command, Solution 

output showing variable names and values, Discussion 

-delete action (find), Discussion 
delimiters, Discussion 

cut command, using open and closed square bracket, Discussion 
/dev/null 

redirecting grep output to, Discussion 

redirecting output to, Solution 
df command, Discussion 
dictionaries, Solution 


diff command, Discussion, Problem 
-p argument, Discussion 
-r and -N arguments, Discussion 
counting hunks in diff output, Solution 
finding lines in one file but not in another, Solution 
output, various forms of, Discussion 
treating all files as ASCII and setting language and time zone to 
universal defaults, Solution 

digits, matching, Discussion 

directories 
adding current directory to the $PATH, Problem 
commands for file information, adapting, Discussion 


commands in current working directory, prefixing with . (dot), 
Discussion 


creating and changing into new directory in one step, Problem 
getting to the bottom of the directory structure, Problem 
home directory, Solution 
in $PATH shell variable, Discussion 
in PATH environment variable in bash, Discussion 
moving quickly among, Problem 
providing for find command, Discussion 
showing which directory you are in, Problem 
to include in $CDPATH, Discussion 
dirname command, Discussion 
using string manipulation instead of, Problem 
dirs command, Discussion 


-p option, Discussion 
DistroWatch.com, Discussion 
divert program, Solution 
documentation 
documenting scripts with comments, Solution 
embedding in shell scripts, Problem 
for bash, learning more about, Problem 
on commands, Discussion 
documents, comparing, Problem, Discussion 
(see also diff command) 
in word processors, Document Comparison 
DOS files, converting to Linux format, Problem 
DOS pause command, emulating, Problem 
dos2unix program, Solution 
dot directory, Discussion 
dot files 
. and .. files, excluding from file listings, Solution 
showing all in current directory, Problem 
duplicates, removing, Discussion 
dynamic loading, using for loadable builtins, Discussion 
E 
-e (exit) option, Discussion 
echo command, Discussion 
defined as an alias, Discussion 
echo -e command, Discussion 


echo -n command, Solution 
in configuration files, problems with, Discussion 
options and escape sequences, echo Options and Escape Sequences 
prefacing rm command with echo, Discussion 
quoting strings to preserve spacing in output, Problem 
redirecting output to a file, Discussion 
returning exit status, Discussion 
searching for secure temporary directory, Solution 
seeing what the shell will pass to scripts and functions, Discussion 
sending function output to STDOUT, Solution 
using echo * as alias for ls, Discussion 
using for shell output to terminal/window, Solution 
using portably in scripts, Problem 
using to see results of pattern match, Solution 
using to test file renaming, Discussion 
writing output without newline, Problem 
echoing 
turning off in read statement, Solution 
turning off using stty -echo, Discussion 
ed utility, Solution 
script stored in a file, Discussion 
editors, Discussion, Discussion, Solution 
ed, Discussion 
invoked by fc command, Discussion 
replacing tabs with spaces, Discussion 


streaming editor (see sed utility) 
vi and ex, use of !, Discussion 

vi and sed, use of slash (/), Discussion 
elif clause, Solution 
elm, Discussion 
else clause, Solution 
Emacs mode commands, Emacs Mode Commands 
email 

finding email address in grep output, Discussion 

from cron jobs, Discussion 

sending from scripts, Problem-Problem 
empty strings, as valid default value for variables, Problem 
enable command, Discussion, Discussion 

enabling and disabling tty loadable builtin, Discussion 
END keyword, Discussion 
end-user tasks as shell scripts, End-User Tasks as Shell Scripts- 
Discussion 

burning a CD, Problem 

comparing two documents, Problem 

loading MP3 files into player, Problem 

printing a line of dashes, Solution 

viewing photos in an album, Problem 
env command, Discussion, Solution, Solution 
environment 

adjusting bash shell environment, Discussion 


system-wide environment settings, Solution 
environment variables 
available in bash 4.4, reference listing, Builtin Shell Variables 
inability to modify in shell scripts, Discussion 
passwords read into, Discussion 
setting default values, Problem 
used by programs like InfoZip, Discussion 
EOF (end of input) marker, Discussion 
escaping to turn off shell scripting, Solution 
leading characters preventing recognition of, Discussion 
not quoting in photo album script, Discussion 
trailing whitespace or characters preventing recognition of, Discussion 
epoch 
converting dates and time to epoch seconds, Problem 
converting epoch seconds to dates and time, Problem 
using epoch seconds for date and time arithmetic, Discussion 
eq operator, Problem 
equality, testing for, Problem-Problem 
determining which operator to use, Solution 
ERR signal, Discussion 
error messages 
displaying when command execution fails, Problem 
for case statement parsing arguments, Discussion 
giving for unset parameters, Problem 
handling in photo album script, Discussion 


including error output in tee output file, Discussion 




redirecting and appending to same file as output, Discussion 
redirecting to different files, Solution 
redirecting to standard error for a function, Discussion 
saving when redirect isn’t working, Solution 
searching for from previous command, using grep, Discussion 
sending to same file as output, Solution 
swapping STDERR and STDOUT before pipe redirection, Problem 
writing your own for parsing with getopts, Problem 

esac, ending case statements, Discussion 

escape sequences 
accepted by echo, echo Options and Escape Sequences 
backslash escape sequences in sed, Solution 
echo command, \n (newline), Discussion 
escaping EOF marker for here document, Discussion 
in fancy bash prompts, Fancy prompts 

working around ANSI or xterm escape sequences, Discussion 

in printf, printf 
in tr utility, Discussion 

/etc/bashrc file, Discussion, Solution 

/etc/inputrc file, Solution 

/etc/passwd file, Solution 

/etc/profile file, Discussion, Solution 

/etc/shells file, Solution 

/etc/sudoers file, Solution 


etckeeper, etckeeper-Other 
eval command, eval 
running lesspipe script in, Solution 
ex utility, Discussion 
examples included with bash, Examples Included with bash-bash 
Documentation and Examples 
exec command, Discussion 
redirecting STDOUT or STDERR, Solution 
executables 
function name as, Discussion 
keeping in personal bin directory, Discussion 
execute permissions on files, Discussion 
forgetting to set, Problem 
execution, Executing Commands-Discussion 
displaying error messages for failures, Problem 
exec for find utility, Discussion 
running a command only if another command succeeds, Problem 
running any executable, Executing Commands-Problem 
running commands from a variable, Problem 
running long jobs unattended, Problem 
running several commands in sequence, Problem 
success or failure of command execution, Problem 
using fewer if statements to check for command return codes, Problem 
exit statement, Discussion, Discussion 
exit status, Discussion 


adding exit 0 before documentation, Discussion 
assigning to a shell variable, Discussion 
exiting bash on encountering failure (nonzero exit status), Discussion 
getting value for, Discussion 
of statements in while loops, Discussion 
script terminated by signals, Solution 
using to run another command after first command succeeds, Solution 
expand_aliases option, unsetting, Discussion 
export command, Discussion 
export -p, Discussion, Solution 
exporting PATH, Discussion 
export statements, Discussion 
exporting variables, Problem 
expecting to change exported variables, Problem 
extdebug option, Discussion 
extended pattern matching, Discussion, Discussion 
file globbing patterns and unzip utility, Solution 
external commands, Discussion 
forcing use before any builtins or functions, Discussion 
extglob option, Discussion, Discussion 
Extra Packages for Enterprise Linux (EPEL), Solution 
F 
FAQ (bash), Official documentation 
fc command, Solution 
feature creep, Solution 


fg (foreground) command, Discussion 
fields, Discussion


extracting from lines of input, Problem 


internal field separator environment variable (see $IFS, under 
Symbols) 


updating specific fields in a datafile, Solution 
using cut command to print out, Discussion 
file command, Problem 
checking if line endings are wrong, Solution 
giving type of file, Solution 
options for output format, Discussion 
file descriptors, Discussion, Discussion, Solution 
closing file descriptor in STDIN, Discussion 
STDIN, STDOUT, and STDERR, Discussion 
swapping STDERR and STDOUT before pipe redirection, Discussion 
filename expansion, Discussion 
filenames 
as arguments to shell commands, Discussion 
containing odd characters, handling with find, Problem 
converting between upper- and lowercase, Problem 
file extensions and compression utilities, Solution 
getting from a search, Problem 
renaming files with wrong suffix, Problem 
shell parameters containing, quoting, Solution 
using bash for basename, Problem 


files 
capturing metadata for recovery, Problem
clobbering on purpose, Problem 
compressing, Problem 
counting differences in, Problem 
counting lines, words, or characters in, Problem 
creating index of many files, Problem 
deleting using an empty variable, Problem 
displaying or using beginning or end of, Problem 
DOS, converting to Linux format, Problem 
editing in place, Problem 
finding, Finding Files: find, locate, slocate-Discussion 
using find, Finding Files: find, locate, slocate 
using list of possible locations, Problem 
using locate and slocate, Solution 
finding lines in one file but not in another, Problem 
getting information about, Problem 
input and output, connecting program to, Standard Output 
keeping files safe from accidental overwriting, Problem 
naming for tar utility, Discussion 
numbering lines in, Problem 
on Unix, Standard Output 
prepending data to existing file, Problem 
reading entire file and then parsing it, Problem 
removing or renaming files with special characters, Problem 
renaming many files, Problem 


searching for string in, Problem 
showing all hidden (dot) files in current directory, Problem
testing for characteristics, Problem-Problem 
uncompressing, Problem 
while loop for filesystem-related conditions, Solution 
with no line breaks, processing, Problem 
filters, Discussion, Intermediate Shell Tools II 
find command, Problem, Solution 
finding files across symbolic links, Problem 
finding files by content, Solution 
finding files by date, using -mtime predicate, Problem 
finding files by size, Problem 
finding files by type, Problem 
handling filenames with odd characters, Problem 
-l option, Discussion 
options for output format, Discussion 
speeding up operations on found files, Problem 
using -iname predicate to run case-insensitive search, Problem 
using GNU find and printf formats to capture file metadata, Solution 
using in alias to get to bottom of things, Discussion 
using output as arguments to rm command, Solution 
using with chmod, Solution 
using with head, grep, or other commands to index files, Solution 
using with xargs, Solution 
fingerprints, support by SSH, Discussion 
Firefox, script to back up and restore sessions, Discussion 


fixed-length records, processing, Problem 
adding newline after each record, Solution
floating-point numbers 
arithmetic in awk using, Discussion 
calculator using floating-point arithmetic, Problem 
looping with floating-point values, Problem 
fmt command, Problem 
for loops 
awk language, Discussion 
converting filenames from upper- to lowercase, Solution 
integer arithmetic in, Discussion 
looping over arguments passed to a script, Solution 
looping with a count, Problem 
special syntax, Discussion 
looping with floating-point values, Problem 
searching for files in several possible locations, Solution 
using "$@" in, Discussion 
using portably, Problem 
using to break up too-long argument lists, Solution 
using to rename many files, Solution 
using to unzip many files, Solution 
wrapping SSH command to run on multiple hosts, Solution 
forced commands (SSH), Solution 
fork(2) manpage, Discussion 
format specifications (printf), Discussion, printf 


formatting dates and time, Problem 
FORTRAN operators, similar to bash operators, Discussion
Fox, Brian, Beginning bash 
FreeBSD, Problem, Discussion 
(see also BSD-based systems) 
compiling and linking tty loadable builtin, Discussion 
getting bash for, Solution 
which utility, Discussion 
from host restriction, Solution 
$FUNCNAME array, Discussion, Discussion 
function reserved word, Discussion 
functions 
avoiding and executing the actual command instead, Problem 
avoiding command not found when using, Problem 
C functions for loadable builtins, Discussion 
date-related shell functions, Discussion, Solution 
defining, Problem 
forms of function definition, Discussion 
func_mced (example), creating and changing into a directory in one 
step, Solution 
parsing program output with a function call, Problem 
security measures for, Discussion 
turning off shell functions with command, Discussion 
using meaningful names for, Discussion 
using parameters and return values, Problem 
using to redefine how builtins work, Discussion 


using to rename or tweak commands, Solution 
G
gawk utility, Working with Dates and Times 
(see also awk utility) 
using to process fixed-length records, Solution 
getconf command, Solution 
getconf ARG_MAX, Discussion
setting secure path, Solution 
getline command, Discussion 
getopt command, Parsing and Similar Tasks 
for loadable builtins, Discussion 
getopts command, Solution 
writing your own error messages for parsing, Problem 
Git, Git-Bazaar 
globbing (extended pattern matching), Discussion, Discussion 
file globbing patterns and unzip utility, Solution 
GNU awk (see gawk utility) 
GNU Core Utilities FAQ, Discussion 
GNU date, Working with Dates and Times 
-d option and %s format, Solution 
-d option, using, Discussion 
converting epoch seconds to dates and time, Solution 
documentation for -d option, Solution 
supplying a default date, Solution 
time zones, %z format, Discussion 


GNU Readline library, Discussion 
(see also readline)
GNU tar, Solution 
GNU tools, Discussion 
grep command, Discussion, Intermediate Shell Tools I 
-c option, Discussion 
finding files by content, Solution 
finding lines in one file but not in another, Solution 
forgetting to provide input for, Discussion 
getting just filename from a search, using -l option, Problem 
getting simple true/false from a search, Solution 
grepping ps output to find if a program is running, Solution 
grepping ps output without getting the grep process, Problem 
grepping the output of, Discussion 
-h option, Discussion 
-i option, Discussion 
ignoring case with -i option, Problem 
-o option, Solution
paring down search finds, Problem 
piping df command output into, Discussion 
piping set command into, Discussion 
searching for text in a pipeline, Problem 
searching through files for a string, Discussion 


searching with more complex patterns, Searching with More Complex 
Patterns 


selecting lines beginning with ?, Discussion 


using in parsing HTML, Solution 
using instead of -name argument to find, Discussion
using on compressed files, Problem 
using on output of svn status command, Solution 
using regular expressions with, Discussion 
using to count hunks in diff output, Discussion 
using with find, Solution 
using with find to index files, Solution 
-v option, Discussion 
grouping commands, using {}, Solution 
grouping symbols for extended pattern matching, Discussion 
gsub utility, Discussion 
removing whitespaces used in padding records, Solution 
guest users, restricting, Problem 
using restricted shell, Discussion 
gunzip command, using augmented completion with, Discussion 
gzcat utility, Solution 
gzip utility, Solution 
-Z option, Solution 
H 
-h (help) option, Discussion, Discussion 
hangup (hup) signals, Discussion 
hash -r command, Solution 
hashes, Solution 
head command, Solution 


using with find to index files, Solution 
header files
for loadable builtins, Discussion 
prepending to datafiles, Discussion 
headers, skipping in files, Problem 
help 
accessing for commands, Discussion 
for loadable builtins, Discussion, Discussion 
help command, Discussion 
startup options for bash, Solution 
--help option, Discussion 
here documents, Solution 
escaping EOF marker to turn off shell scripting, Problem 
indenting, Problem 
using for documentation embedded in scripts, Solution 
hexdump -C command, piping output through, Solution 
hexdump.pl script, Discussion 
hidden (.dot) files, showing, Problem 
histappend option, Discussion 
histograms 
creating using awk, Problem 
creating using bash, Problem 
history 
initialization file for shell history, Solution 
setting shell history options, Problem 


synchronizing shell history between sessions, Problem 
history command, Solution, Problem
history operator (!!), Discussion, Solution 
adding an editing qualifier, Solution 
host alias, Solution 
host restriction, Solution 
hostnames, resolving to IP address, Solution 
hosts, finding external, routable address for, Solution 
housekeeping tasks (see administrative and housekeeping tasks) 
HP-UX 
ARG_MAX limits, Discussion
getting bash for, Discussion 
HTML 
generating pages to view photo album, Solution-Discussion 
parsing from bash, Problem 
hunks (or chunks) in diff output, Discussion 
I 
I/O (input/output), Standard Output 
(see also output) 
breaking up input into fixed sizes, Solution 
changing script behavior with redirections, Problem 
connecting program to files for, Standard Output 
getting input for script from another machine, Problem 
reference list of redirectors, I/O Redirection 
reversing word order in input lines, Problem 


standard input, Standard Input-Discussion 
getting input from a file, Discussion
getting input from user, Problem 
getting yes or no input, Problem 
indenting here document, Problem 
keeping data with your script, Problem 
prompting for a password, Problem 
selecting from list of options, Problem 
using output as input to connect two programs, Problem 
validating external input, Solution, Problem 
while loop for reading input, Solution 
if statements, Solution-Discussion 
double parentheses around if expression, Discussion 
general form of, Discussion 
testing for file characteristics, Solution 
testing for more than one file characteristic, Problem 
testing for string characteristics, Solution 
testing with pattern matches, Solution-Problem 
testing with regular expressions, Problem 
tests in, and two kinds of syntax, Discussion 
using exit status in, Discussion 
running second command if first command succeeds, Solution 
using fewer to check for command return codes, Problem 
with elif and else clause, Solution 
if/then/else statement, Discussion 


using case statement instead of, Solution 
ifconfig utility, Solution
output examples from different machines, Discussion 
in keyword, Discussion 
-iname predicate (find), Solution 
info command, Problem 
InfoZip packages, zip and unzip, Solution 
initialization files, Problem-Problem 
cheat sheet for the files and what to do with them, Solution 
readline init file syntax, readline Init File Syntax 
input preprocessors, Solution 
.inputrc file, Solution
initialization file, Solution 
sample file, Solution 
integer arithmetic, Solution 
integer arithmetic expressions, Discussion 
interactive mode, determining if shell is/is not running in, Problem 
internal field separator variable (see $IFS, under Symbols) 
interpreter spoofing attacks, avoiding, Problem 
interval expressions, Searching with More Complex Patterns 
IP addresses 
finding, Solution 
finding for machine you're using, Problem 
regular expression for, Discussion 
sorting, Problem 


ISO 8601 standard for displaying dates and time, Discussion 
ISO filesystem images, Solution
is_process_running script, Solution
J 
jobs 
job control, disabling, Discussion 
job number in Linux, Discussion 
running long jobs unattended, Problem 
K 
key pairs (SSH), Solution 
keychain script, Solution 
troubleshooting, Discussion 
using with --clear option, Solution 
kill command, Discussion 
kill -l, Solution
killing job or process in Linux, Discussion 
POSIX differences affecting, Discussion 
sending SIGTERM signal, Discussion 
textual completion for, Discussion 
-n option or signals, Discussion 
-KILL option, Discussion 
Korn shell (ksh), Beginning bash, Why bash? 
L 
-L option (pwd and cd), displaying logical path, Discussion 
last command, Solution 


adding a suffix to output, Discussion 
last in, first out (LIFO), Discussion
lastpipe option, Solution 
working only if job control disabled, Discussion 
leap years, handling, using tools for, Discussion 
less command, Discussion, Problem 
displaying line numbers on the screen, Discussion 
piping info command output into, Discussion 
svn -v log | less, Discussion
using with compressed files, Discussion 
lesspipe script, Solution 
different versions on different systems, Discussion 
let statements, Solution 
arithmetic and assignment operators in, Discussion 
comma (,) operator and, Discussion 
quoting in arithmetic operations, Discussion 
whitespace in, Discussion 
LibreOffice, Solution 
line breaks, adding to a file, Problem-Problem 
line number ($LINENO), Discussion, Discussion
lines, numbering, Problem 
Linux 
ARG_MAX limits, Discussion
bash on, Why bash? 
chsh -l command, listing shells with, Solution 


converting DOS files to Linux format, Problem 
cross-platform shell scripts written on, problems with, Discussion
/dev/null, Discussion 
echo command on, Discussion 
getting bash for, Problem 
prompt, example of, Solution 
root user, Discussion 
Security Enhanced Linux (SELinux) and MAC, Discussion 
sort order on, Discussion 
sudo on, Discussion 
using GNU info and Texinfo on, Problem 
Vixie-cron, Solution 
lithist option, Discussion 


loadable builtins, Adding New Features to bash Using Loadable Builtins- 
Improving Programmable Completion 


locales 


sort order and, Discussion 


tr command respecting locale's collating sequence, Solution 
locate, Discussion, Solution 
logger 
using correctly, Problem 
-t option, Discussion 
other options, differences in systems and versions, Discussion 
using to send logging from scripts to syslog, Solution 
logging 


capturing output of entire session or batch job, Solution 
setting up for script using phases, Discussion

to syslog from your scripts, Discussion, Problem 

writing to a circular log, Problem 

clearing out previous data before writing new data, Discussion 

logical operators 

-a (logical AND) operator, Discussion, Problem 

-o (logical OR) operator, Problem

logical AND operator in C, && and, Discussion 

logical AND, OR, and NOT constructs with find command, Discussion 

logical OR operator in C, || syntax and, Discussion 
logouts 

bash_logout file (example), Solution 

clearing screen on logout, Problem 

initialization file, Solution 
long lines of code, breaking, Discussion 
long-form command-line options, Parsing and Similar Tasks 
ls command, Discussion, Intermediate Shell Tools I

accessing documentation on, Discussion

command ls, Discussion

ls -1 command, Solution

parsing output into an array, Discussion

ls -A, Solution

ls -a, showing all files, Problem

ls -d, Discussion


pattern matching with, Discussion
running alias for, Discussion

piping into awk to limit output, Problem 

redirecting output to a file, using -C option, Solution 

Trojaned, Discussion 

useful options, Discussion 

using to get more information about a file, Problem 
lynx utility, Solution 
M 
macOS 


bash on, Why bash?, Discussion 
chsh and chpass -s commands, Solution 
current versions shipping with bash 3.2 as /bin/sh, Solution 
getting bash for, Problem 
MAC implementation, Discussion 
sudo on, Discussion 
mailers and message transfer agents (MTAs), Solution 
just enough MTA for cron, Discussion 
mail and mailx, Solution 
mailto, Discussion 
MAILTO variable, Solution 
man command, Discussion 
man in the middle attacks, Discussion 
mandatory access control (MAC) systems, Discussion, Discussion 
manpages, Discussion 


mapfile command, Solution 
Meld, Meld
menus 
changing prompt in simple select menus, Problem 
creating a simple menu, Solution 
Mercurial, Mercurial-Subversion 
mkdir -m command, Solution 
mkdir command, Discussion 
mkisofs program, Solution 
mktemp utility, Discussion 
using with fallback to /dev/urandom, Discussion 
modulo, modulus, or mod (see % remainder operator) 
more command, Discussion 
MP3 files 
finding, Problem 
loading MP3 player with, Problem 
mpack, Discussion 
-mtime predicate (find), Solution 
mutt, Discussion 
mv command, Discussion, Discussion 
using with xargs, Discussion 
mysql command, Discussion 
-u option, Discussion 
MySQL, setting up databases, Problem 
N 


\{n,m\}, \{n\}, or \{n,\} in interval expressions, Searching with More
Complex Patterns
NetBSD, Problem 
(see also BSD-based systems) 
compiling and linking tty loadable builtin, Discussion 
stable sort, Discussion 
Netcat, Solution 
network redirection features 
using bash net-redirections, Problem 
using in Logger, Netcat, or bash, Solution 
Network Time Protocol (NTP), Working with Dates and Times 
newlines 
in bash versus HTML, Discussion 
writing output without, Problem 
next command, Discussion 
nl command, Solution, Discussion 
No such file or directory errors, Problem 
noclobber option, Discussion 
limitations of, Discussion 
overriding using >| redirection syntax, Solution 
nohup command, Solution, Discussion 
NTP (Network Time Protocol), Working with Dates and Times 
nullglob option, Discussion 
nullmailer, Discussion 
numbered variables, Solution 


numbers 
getting absolute value of, Problem
in file descriptors, Discussion, Discussion, Discussion 
putting commas in for thousands, Problem 
regular expression matching a Social Security number, Problem 
sorting with sort utility, Problem 
IP addresses, Problem 
summing a list of, using awk, Problem 
O
-o (logical OR) operator, Solution
od (octal dump) command, Discussion 
ODF (OpenDocument Format), Solution 
.odt file extension, Solution 
OpenBSD, Writing Secure Shell Scripts, Discussion, Discussion 
(see also BSD-based systems) 
getting bash for, Solution 
OpenPKG project, Discussion 
OpenSSH, Problem, Discussion 
operating systems 
bash on, Why bash? 
command-line interface, Beginning bash 
shell accounts for, Polarhome, Discussion 
Shell, separation from other parts, Beginning bash 
operators 
arithmetic, Discussion 


assignment, Discussion 
binary operators testing for file characteristics, Discussion
case conversion, Solution 
comparison operators in bash, Discussion 
string manipulation, Discussion 
unary operators testing for file characteristics, Discussion 
options 
for loadable builtins, Discussion 
GNU long options, Discussion 
setting shell options, Discussion 
shell options to configure history file handling, Discussion 
specifying on the command line, Parsing and Similar Tasks 
to shell scripts, removing after processing, Problem 
turning on shell options, Discussion 
output, Standard Output-Discussion 
adding a prefix or suffix to, Problem 
appending rather than clobbering, Problem 
connecting program to output file, Standard Output 
connecting two programs using output as input, Problem 
cutting out parts of, Problem 
displaying/using beginning or end of a file, Problem 
from functions, Solution 
keeping files safe from accidental overwriting, Problem 
keeping only a portion of line of output, Problem 
keeping some and discarding the rest, Problem 


preserving spacing in, Problem
redirecting for the life of a script, Problem
redirecting to a file, Problem
redirecting to file with ls command, Problem
redirecting to files other than in current directory, Problem
redirecting to several different places, Problem
saving or grouping from several commands, Problem
saving output used as input, Problem
saving when redirect doesn't work, Problem
sending output and error messages to different files, Problem
sending output and error messages to same file, Problem
skipping the header in a file, Problem
sorting, Problem
splitting only when necessary, Problem
swapping STDERR and STDOUT before pipe redirection, Problem
throwing away, Problem
using as arguments to connect two programs, Problem
viewing in hex mode, Problem
writing to the terminal/window, Problem
writing with more formatting control, Problem
writing without newlines, Problem
P
-p (prompt) option, read statement, Discussion


-P option, pwd and cd, Discussion 


package systems, using to update bash and your system, Solution 


paragraphs 


rewrapping lines in, using fmt, Problem 




showing paragraph of text after found phrase, Solution 
parameter expansion, Discussion 

${!prefix*} syntax, Discussion 

${!prefix@} syntax, Discussion 

${#VAR}, Discussion

${1:0:1} syntax, substring of a shell variable, Discussion, Problem 

${:+} syntax, Solution 

${:-} syntax, Solution 

${:?} syntax, Solution 

${VAR#alt}, Discussion 

removing text that matches a pattern, Solution 
parameters 

for bash functions, Solution 

handling lists of parameters with spaces, Problem 

handling parameters with spaces, Problem 

unset, giving error message for, Problem 

using command-line parameters in shell scripts, Solution 
parsing, Parsing and Similar Tasks-Discussion 

compressing whitespace, Problem 

converting datafile to CSV, Problem 

extracting fields in a datafile and updating them, Problem 

isolating specific fields in data, Problem 

of arguments for shell scripts, Discussion 

of CSV datafile, Problem 


of output into an array, Problem 




of output using a function call, Problem 
processing fixed-length records, Problem 
taking strings apart one character at a time, Problem 
trimming whitespace from lines for fields of data, Problem 
using bash to parse HTML, Problem 
using getopts command to parse shell script arguments, Problem 
writing your own error messages, Problem 
using read -a to parse words into an array, Problem 
using read statement to parse text, Problem 
using to make words plural, Problem 
passwd -e command, Solution 
passwd -l command, Solution 
passwords 
editing, Solution 
hardcoding in a script, Problem 
leaking into the process list, Problem 
prompting user for, Problem 
protecting from access, Discussion 
patch command, Solution 
--dry-run option, Solution 
-Np1 arguments, Solution 
patches 
applying a patch file, Solution 
creating with diff, Problem 


PATH environment variable, Discussion 




$PATH shell variable, Discussion 
adding current directory to, Problem 
adding ~/bin to, Solution 
changing permanently, Solution 
changing temporarily, Problem 
current directory not in, Problem 
finding files on or not on $PATH, Problem 
finding out how $PATH is set, Discussion 
finding world-writable directories in, Problem 
security risk with putting . (dot) in, Discussion 
setting a POSIX $PATH, Problem 
setting secure $PATH, Problem 

pathname expansion, Discussion 

pathnames, finishing with Tab key, Problem 

paths, Solution 
(see also $PATH shell variable) 
absolute, Discussion 
default and POSIX, on several systems, Solution 
listing full path in search for files not on $PATH, Solution 
relative, Discussion 
stored by slocate, Discussion 

pattern matching, Discussion, Searching with More Complex Patterns 
(see also regular expressions) 
characters in bash, Pattern-Matching Characters 


forgetting that it alphabetizes, Problem 




in case statement, Discussion, Discussion, Discussion 
leading and trailing spaces, Solution 
mechanisms performed by, Discussion 
reducing typos in, Problem 
regular expressions and, Discussion 
testing strings with, Problem-Problem 
using to validate input, Solution 
pause command (DOS), emulating, Problem 
PC emulator, Discussion 
Perl, Discussion 
converting a datafile to CSV, Solution 
converting epoch seconds to dates and time, Discussion 
date and time modules, Discussion 
dates and time, converting to epoch seconds, Discussion 
getting yesterday and tomorrow's dates, Problem 
hexdump script, Discussion 
matching IP addresses with regular expressions, Discussion 
using to add prefix or suffix to last command output, Discussion 
using to commify numbers, Discussion 
using to number lines in a file, Discussion 
using to parse a CSV datafile, Solution 
using to parse ifconfig output for IP address, Discussion 
using to process fixed-length records, Solution 
Perl Compatible Regular Expressions (PCRE), Discussion 
permissions, setting, Problem 


pgrep utility, Solution 




phases, using to automate a process, Problem 
usage or summary routine listing phases, Discussion 
photo album, generating, Problem 
PID (see process ID) 
pipelines, Discussion, Command-Line Processing Steps 
forgetting that they make subshells, Problem 
hooking up sort to any program's standard output, Discussion 
search for text in, using grep, Problem 
pluralizing words, Problem 
POD (Plain Old Documentation), Discussion 
Polarhome, Solution 
popd command, Solution 
portable scripts, Advanced Scripting 
(see also scripting, advanced) 
developing, Problem 
using echo portably, Problem 
using for loops portably, Problem 
using virtual machines for testing, Solution 
positional parameters, Discussion 
inside $(( )), Discussion 
shift statement and, Solution 
--posix option, Discussion 
POSIX standard, Why bash? 
differences affecting trap utility, Discussion 


POSIX syntax 




character classes within brackets, Pattern-Matching Characters 
developing portable scripts, Problem 
running bash in POSIX mode, $CDPATH and, Discussion 
setting a POSIX $PATH, Problem 
sorting IP addresses, Solution 

postfix style notation, Discussion 

PowerShell (Windows), Solution 
using, Using PowerShell or other native tools 

predicates, Discussion 

print predicate for find utility, Discussion 

print statement in awk, Discussion 

printf binary executable, Discussion 

printf command, Problem, Discussion, Discussion, Discussion, Discussion 
best practices for, Discussion 
error message for case statement parsing arguments, Discussion 
following read -s command, Discussion 
function for printf statement, Discussion 
newer format supporting date and time values, Discussion 
printf "%b", Solution
printf '%(fmt)T' for dates and times, bash 4 or newer, Solution
printf binary executable versus, Discussion 
reference on, printf-Examples 
seeing odd behavior from, Problem 

too many arguments, Discussion 


strftime format for today, Discussion 




using in awk for loop, Discussion 
using in GNU find to capture file metadata, Solution 
using with awk to show histogram of some data, Discussion 
shopt -s xpg_echo, Solution
short circuits, Discussion 
SIGHUP signal, Discussion 
SIGKILL signals, Discussion 
signals 
providing for kill command textual completion, Discussion 


script trapping and responding to, Problem 




SIGTERM signal, Discussion 
-size predicate (find), Solution 
slocate, Discussion, Solution 
Social Security number, searching for, using regular expressions, 
using with chmod, Solution
xpg_echo, Solution 
xtrace (debugging) prompt ($PS4), Discussion, Solution 
xtrace, turning on and off, Solution 
xz utility, Solution 
Y 
years, Discussion 
(see also dates and time) 
Unix commands and, Solution 
yes or no input, getting from the user, Problem 
Z 
zcat utility, Solution 
zgrep utility, Solution 
ZIP files, unzipping many in a directory, Problem 


zip utility, Solution, Discussion 
Colophon


The animal on the cover of bash Cookbook is a wood turtle (Glyptemys
insculpta), so named because its shell looks like it was carved from
wood. The wood turtle can be found in forests and is very common in North
America, particularly from Nova Scotia through to the Great Lakes region. The
wood turtle is an omnivorous and lazy eater; it will eat whatever crosses its
path, including plants, worms, and slugs (a favorite). But this isn’t to say
wood turtles are slow—in fact, they can be quite agile and quick to learn. 
Some researchers have seen wood turtles stamping on the ground to mimic 
the sound of raindrops, which lures worms out to their certain death. 


Wood turtles are threatened by human expansion into their territories. They 
nest on the sandy banks of rivers, streams, and ponds, which are prone to 
erosion, damming, and use by outdoor enthusiasts. Roadside fatalities, toxic 
pollution, and the pet trade have also taken a toll on the wood turtle 
population, so much so that in many states and provinces they are considered 
a threatened species. 


Many of the animals on O’Reilly covers are endangered; all of them are 
important to the world. To learn more about how you can help, go to 
animals.oreilly.com. 


The cover image is from Dover Pictorial Archive. The cover fonts are URW
Typewriter and Guardian Sans. The text font is Adobe Minion Pro; the 
heading font is Adobe Myriad Condensed; and the code font is Dalton 
Maag’s Ubuntu Mono. 